CLONED HUMAN SERUM ALBUMIN GENE



This invention relates to a method for synthesising a
human serum albumin gene. This invention further relates
to a plasmid containing a cloned human serum albumin
gene and a microorganism transformed with such a
plasmid.



Human serum albumin (sometimes referred to 
hereinafter as HSA) is the major protein component of 
plasma. The protein is produced in the liver and is 
primarily responsible for maintaining normal osmolarity
in the bloodstream. It also is capable of binding and 
transporting various small molecules via the blood. 

HSA is administered in various clinical situations. 
Shock and burn victims, for instance, usually require 
doses of HSA to restore blood volume and thus ameliorate 
some of the symptoms associated with trauma. Persons 
suffering from hypoproteinemia or erythroblastosis 
fetalis also are likely to require treatment with serum 
albumin. 

To date, HSA is produced primarily as a by-product
from the fractionation of donated blood. A drawback to
this is that the cost and supply of blood can vary 
widely. The blood also may contain undesirable agents 
such as hepatitis virus. It therefore would be 
advantageous to develop an alternative source of HSA. 

It accordingly is an object of this invention to 
produce human serum albumin in microorganisms. It is a 
further object of this invention to so produce HSA 
economically. It also is an object of this invention to 
develop a cloning procedure that can be applied to other
serum proteins. 

Brief Description of the Figures 

Figure 1 shows a partial restriction map of a full-length
HSA cDNA clone isolated by the procedures described herein.

Figure 2 shows the DNA sequence of the 5'→3' strand
of the non-coding and coding regions of the full length
HSA cDNA, as well as the amino acid sequence specified by
the DNA sequence.

Figure 3 shows an A260 profile of sucrose gradient
fractions of mRNA. Fraction group B was used as the
template in the synthesis of HSA cDNA.

Figure 4 shows pGX401, a recombinant plasmid
containing a full length HSA cDNA insert.

Figure 5 shows the DNA sequence in the region of 
codon 97 for HSA sequences derived from three different 
human livers. 

According to one aspect of the present invention, we
provide a synthetic human serum albumin gene. The term
"synthetic" as used herein should be understood to include
DNA sequences produced by use of recombinant DNA techniques
and/or chemical synthesis.

In accordance with the present invention, a novel human
serum albumin (HSA) gene has been cloned and bacterial
expression of the gene is described. The nucleotide sequence
of the full length HSA gene and the amino acid sequence of
the polypeptide specified by that gene also are reported
herein.

The procedure more fully described hereinafter, which
has been used to prepare an HSA-producing microorganism, can
be divided into the following stages: (1) obtaining HSA mRNA
from a suitable source, e.g. by recovery and isolation of the
HSA mRNA from HSA producing cells, (2) in vitro synthesis of
complementary DNA (cDNA), using the mRNA as a template, and
conversion of the cDNA to the double-stranded form, and
(3) insertion of the double-stranded cDNA into a suitable
cloning vector and transformation of microbial cells with
that cloning vector. The procedures described herein
resulted in the preparation of a "full-length" cloned
HSA cDNA.

Eukaryotic genes are contained in the chromosomal
DNA of cell nuclei. This chromosomal DNA exists in a
compact nucleoprotein complex called chromatin.
Eukaryotic chromosomal DNA contains intervening sequences
(introns) within the coding sequences (exons), which
would not permit correct expression in bacteria. For
this reason a preferred method for producing contiguous
coding blocks of a particular protein involves the use of
messenger RNA (mRNA). Messenger RNA has a ribonucleotide
sequence corresponding to the gene of interest without
the introns and conveniently can be recovered from
eukaryotic cells that produce the protein specified by
the gene.
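
As a purely illustrative sketch of why mRNA yields a contiguous coding block, the toy Python below joins the exon intervals of a made-up transcript. The sequence, the interval coordinates and the function name `splice` are hypothetical and are not taken from the HSA gene.

```python
def splice(transcript: str, exons: list[tuple[int, int]]) -> str:
    """Join exon intervals (0-based, end-exclusive) into one contiguous block."""
    return "".join(transcript[start:end] for start, end in exons)


# Toy primary transcript: two exons (upper case) separated by one intron (lower case).
toy_transcript = "AUGGCC" + "guaaguuuucag" + "GAGUAA"
toy_exons = [(0, 6), (18, 24)]  # hypothetical coordinates of the two exons

mature_mrna = splice(toy_transcript, toy_exons)
print(mature_mrna)
```

The spliced product contains only the exon sequence, which is the property that makes mRNA a convenient starting point for bacterial expression.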

Human serum albumin mRNA can be recovered in useful
quantities from human liver cells. The HSA mRNA produced
by the liver cells is complementary to one of the two
strands of the HSA gene and may be employed as a template
for the synthesis of complementary DNA (cDNA) as
hereinafter described. To effectively utilize the mRNA for
the synthesis of cDNA, it advantageously is recovered from
the cells in relatively pure form. The guanidine
thiocyanate/guanidine hydrochloride extraction procedure
described by McCandliss et al., Methods in Enzymology
79:51 (1981), advantageously may be used to recover and
purify the HSA mRNA. RNA is inherently less stable than
DNA, and is particularly subject to degradation by
ribonucleases that are present in the cells. Therefore, mRNA
recovery procedures generally employ means for rapidly
inactivating any ribonucleases which are present.

In general, recovery of total RNA is initiated by
disrupting the cells in the presence of a
ribonuclease-inactivating substance. Disruption of the cells
may be accomplished by subjecting the cells to a lysing
reagent, freezing/thawing, or mechanical disruption;
preferably a combination thereof. A mixture of guanidine
thiocyanate and a reducing agent, such as mercaptoethanol,
has been found to function effectively as a ribonuclease
inactivator (McCandliss, et al., supra).

After disruption of the cells, the solid cell debris
is removed, e.g. by centrifugation, and the RNA is
precipitated from the resulting clarified solution.
Precipitation is effected by known techniques, such as
adding a water-miscible alcohol, e.g. ethanol, to the
solution in a precipitating amount. The RNA then is
resuspended in a guanidine hydrochloride solution and
precipitated with ethanol for two successive cycles. At
this point the RNA is undegraded and free of proteins and
DNA.

The next step is the separation of mRNA from the
total precipitated RNA. Human serum albumin mRNA is
polyadenylated; therefore, it readily can be separated
from non-adenylated RNA by affinity chromatography with
oligodeoxythymidylate (oligo dT) cellulose (Aviv, H., et
al., Proc. Natl. Acad. Sci. USA 69:1408 (1972);
McCandliss, et al., supra). Total RNA can be applied to
a column in an approximately 0.5 M NaCl containing
solution. Under these conditions only poly A+ RNA binds to
the oligo dT cellulose and can be removed specifically by
washing the column in a salt free solution.

To enrich the preparation for HSA mRNA, the poly
A+ RNA can be fractionated according to size by sucrose
gradient centrifugation. Activity of the RNA in the
various gradient fractions can be verified by in vitro
translation in a reticulocyte lysate (Pelham, H., et al.,
Eur. J. Biochem. 67:247 (1976)) and by electrophoretic
analysis of the protein products (Laemmli, U., Nature
227:680 (1970)).

Once a poly A+ RNA fraction able to direct the synthesis of
proteins the size of HSA has been isolated, it can be
used to provide a template for cDNA synthesis. This
procedure involves enzymatically constructing
double-stranded DNA, which has a nucleotide base pair
sequence identical to the coding sequence of the original
chromosomal gene. The cDNA does not contain any
noninformational segments (introns) within the coding region
which might be present in the eukaryotic gene, and thus can
ultimately be transcribed and translated in prokaryotic
systems.

Synthesis of HSA cDNA employs the enzymes reverse
transcriptase, Klenow fragment of DNA polymerase I and S1
nuclease (Kacian, D., et al., Proc. Nat. Acad. Sci. USA
73:2191 (1976); McCandliss, R., et al., Methods in
Enzymology 79, p. 601 (1981)). Reverse transcriptase
catalyzes the synthesis of a single strand of DNA from
deoxynucleoside triphosphates on the mRNA template. The
poly r(A) tail of the mRNA permits oligo (dT) (of about
12 to 18 nucleotides) to be used as a primer for cDNA
synthesis. The use of a radioactively-labelled
deoxynucleoside triphosphate facilitates monitoring of the
synthesis reaction. Generally, a 32P-containing
deoxynucleoside triphosphate advantageously may be used
for this purpose. The cDNA synthesis generally is
conducted by combining the mRNA, the deoxynucleoside
triphosphates, the oligo (dT) and the reverse
transcriptase in a buffered solution. This solution is
incubated at an elevated temperature, e.g., about 40-50°C,
for a time sufficient to allow formation of the cDNA copy,
e.g. about 5-20 minutes. The conditions of the reaction are
essentially as described by Kacian, D.L., et al., supra.
After incubation, disodium ethylenediaminetetraacetic
acid (hereinafter EDTA) is added to the solution, and the
solution is extracted with phenol:chloroform (1:1 by
vol.). The aqueous phase is advantageously purified by
gel filtration chromatography, and the cDNA-mRNA complex
in the eluate is precipitated with alcohol.
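
The relationship between the mRNA template and the first cDNA strand can be sketched as a reverse complement. The short sequence below is invented for illustration and the function name `first_strand_cdna` is hypothetical; the sketch models only base pairing, not the enzymology.

```python
# Pairing of each RNA template base with the DNA base incorporated opposite it.
RNA_TO_DNA_COMPLEMENT = {"A": "T", "U": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna: str) -> str:
    """Return the cDNA strand (written 5'->3') copied from an mRNA written 5'->3'."""
    return "".join(RNA_TO_DNA_COMPLEMENT[base] for base in reversed(mrna))


# Toy message ending in a short poly(A) tail (hypothetical sequence).
toy_mrna = "AUGGAGAAAAAA"
cdna = first_strand_cdna(toy_mrna)
print(cdna)
```

Because synthesis is primed on the poly(A) tail, the copy begins with a run of T residues at its 5' end.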

The mRNA can be selectively hydrolyzed in the
presence of the cDNA with dilute sodium hydroxide (about
0.1 M) at an elevated temperature, e.g., about 60-80°C,
for about 15-30 minutes. Neutralization of the alkaline
solution and alcohol precipitation yields a
single-stranded cDNA copy.

The single-stranded cDNA copy has been shown to have
a 5'-poly (dT) tail, and to have a 3' terminal hairpin
structure, which provides a short segment of duplex DNA
(Efstratiadis, A., et al., Cell, 7, 279 (1976)). This 3'
hairpin structure can act as a primer for the synthesis
of a complementary DNA strand. Synthesis of this
complementary strand is conducted using the Klenow
fragment of DNA polymerase I (Klenow, H., et al., Eur. J.
Biochem., 22, 371 (1971)) in a reaction mixture
containing the deoxynucleoside triphosphates. The duplex
cDNA recovered by this procedure has a 3' loop, resulting
from the 3' hairpin structure of the single-stranded cDNA
copy. This 3' loop can be cleaved by digestion with the
enzyme S1 nuclease, using essentially the procedure of
McCandliss et al., Methods in Enzymology 79:601 (1981).
The S1 nuclease digest may be extracted with
phenol-chloroform, and the resulting cDNA precipitated from
the aqueous phase with alcohol.

The intact double-stranded DNA (about 2000 base
pairs) corresponding to a human serum albumin gene can be
isolated by, for example, sucrose gradient
centrifugation, using the procedure of McCandliss, supra,
p. 51. In order to determine the sizes of the DNA in the
sucrose gradient, aliquots of the gradient fractions are
electrophoresed in a polyacrylamide gel with molecular
weight markers. The resulting gel is first stained with
ethidium bromide to visualize the markers and then
autoradiographed to detect the radioactive cDNA. The
fractions of the gradient containing DNA molecules larger
than 1000 base pairs are pooled and the DNA is
precipitated with ethanol.

For purposes of amplification and selection, the
double-stranded cDNA gene prepared as described above is
generally inserted into a suitable cloning vector, which
is used for transforming appropriate host cells.

Suitable cloning vectors include various plasmids and
phages, with plasmids being preferred in this case. The
criteria for selecting a cloning vector include its size,
its capability for replicating in the host cells, the
presence of selectable genes, and the presence of a site
for insertion of the gene. With respect to its size, the
vector is advantageously relatively small, to permit
large gene insertions, and so as not to divert large
amounts of cellular nutrients and energy to the
production of unwanted macromolecules. The vector also
includes an intact replicon which remains functional
after insertion of the gene. This replicon preferably
directs the desired mode of replication of the plasmid,
i.e., multiple copies or a single copy per cell, or a
controllable number of copies per cell. Genes specifying
one or more phenotypic properties, preferably antibiotic
resistance, facilitate selection of transformants. The
insertion site is advantageously a unique restriction
site for a restriction endonuclease. A cloning vector
meeting all of these criteria is the plasmid pBR322. The
cDNA can be conveniently inserted into this plasmid by a
homopolymeric tailing technique. Homopolymer tails are
added to the 3'-hydroxyl groups of the human serum
albumin double-stranded cDNA gene, by reaction with an
appropriate deoxynucleoside triphosphate, in the presence
of terminal deoxynucleotidyl transferase. The plasmid is
opened by digestion with the appropriate endonuclease, and
complementary homopolymer tails are added to the 3'
hydroxyl groups of the opened plasmid, using the
homopolymeric tailing technique. Appropriate reaction
conditions have been described for the addition of dC
residues to ds cDNA (McCandliss, R., et al., page 601
supra; Roychoudhury, et al., Nucleic Acids Research
3:101 (1976)) and of dG residues to PstI-treated pBR322
(Maeda, S., Methods in Enzymology 79:607 (1981)). In a
preferred embodiment, however, the molar excess of dNTPs
to 3' ends is in the range of 3000 to 5000. Progress of
the reactions is monitored until the chain length is
approximately 15. The tailed cDNA and plasmids are
recovered, e.g., by phenol extraction followed by alcohol
precipitation. The homopolymeric ends of the two DNAs
are complementary and will anneal together under
appropriate conditions to yield a recombinant plasmid
containing the HSA gene (Maeda, S., Methods in Enzymology
79:611 (1981)).

A suitable strain of E. coli may be transformed with
this recombinant plasmid, using essentially the method of
Lederberg, J. Bacteriology 119:1072 (1974), and be
maintained indefinitely.

Generally, several hundred to several thousand
clones are produced by these procedures and can be
screened for the presence of the HSA gene with, for
example, rat serum albumin cDNA. A nick translated
(Maniatis, T., et al., Proc. Natl. Acad. Sci. USA 72:3961
(1975)) rat cDNA having 85% homology with human cDNA can
be used to hybridize to plasmid cDNA attached to
nitrocellulose filters (Grunstein, M., et al., Proc.
Natl. Acad. Sci. USA 72:396 (1975); Southern, E.M., J.
Mol. Biol., 98:503 (1975)). In this procedure, DNA from
each colony (or from groups of colonies) is fixed to
discrete zones of a nitrocellulose filter and denatured.
Alternatively, the DNA can be electrophoresed in a gel
prior to fixing on a filter. A solution of the
radioactively labeled rat cDNA is applied thereto under
hybridizing conditions. Unhybridized rat cDNA is washed
from the filter, and colonies containing DNA to which the
rat cDNA hybridized are identified by autoradiography.
One positive clone was identified but found to be an
incomplete HSA cDNA by DNA sequencing. A portion of this
HSA cDNA was then nick translated in order to rescreen
the entire bank of clones. Ninety positive hybridization
signals were thus obtained.

Positive clones may be cultivated on suitable growth
media to obtain ample quantities of cells from which to
extract the plasmid DNA. The plasmid DNA is extracted
using conventional techniques, such as disruption of the
cells, followed by phenol extraction and alcohol
precipitation. The plasmid and chromosomal DNAs may be
separated, e.g. by electrophoresis or cesium chloride
equilibrium centrifugation. Plasmids containing
inserts of about 1500 to 2000 base pairs are selected for
further characterization.

The cloned gene can be excised from the plasmid DNA
and then characterized by sequencing analysis (Sanger,
F., et al., Proc. Natl. Acad. Sci. USA 74:5463 (1977);
Maxam, A., et al., Proc. Natl. Acad. Sci. USA 74:560
(1977)).

By these procedures a prepro-HSA clone has been
isolated. An E. coli HB101 culture transformed with the
plasmid containing this prepro-HSA gene has been
deposited with the U.S. Department of Agriculture
Northern Regional Research Laboratory in Peoria,
Illinois, as NRRL No. B-15784. A diagnostic partial
restriction map of this HSA gene insert is shown in
Figure 1 of the drawings and Figure 2 shows the 5'→3'
strand of the non-coding and coding regions, along with
the amino acid sequence specified by the gene.

The cloned prepro-HSA coding sequence consists of 2050 base
pairs excluding the oligo dC tails added to the cDNA. The
gene has noncoding regions at the 5' end (base pairs 1-31)
and at the 3' end (base pairs 1858-2050). The 5' end of the
coding region (base pairs 32-103) includes a 24
amino-acid leader (an 18-amino-acid "pre" sequence
followed by a 6-amino-acid "pro" sequence) and the mature
human serum albumin protein is specified by the region
from base pair number 104 to base pair number 1858.
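
The coordinates quoted above can be checked with simple interval arithmetic. The sketch below assumes 1-based, inclusive base-pair numbering; the names `REGIONS` and `length_bp` are illustrative only.

```python
# Regions as (first base pair, last base pair), 1-based and inclusive (assumed).
REGIONS = {
    "5' noncoding": (1, 31),
    "prepro leader": (32, 103),
    "mature HSA": (104, 1858),
}


def length_bp(first: int, last: int) -> int:
    """Number of base pairs in an inclusive interval."""
    return last - first + 1


leader_bp = length_bp(*REGIONS["prepro leader"])
mature_bp = length_bp(*REGIONS["mature HSA"])

# A codon is three base pairs, so each coding length should divide evenly by 3.
leader_codons, leader_remainder = divmod(leader_bp, 3)
mature_codons, mature_remainder = divmod(mature_bp, 3)

print(leader_bp, leader_codons, mature_bp, mature_codons)
```

Under that numbering the leader spans 72 base pairs, matching the stated 18 + 6 = 24 amino acids, and the mature region spans a whole number of codons.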

As used in Figure 2 and elsewhere herein, the
abbreviations have the following standard meaning:
GLN = glutamine
ASN = asparagine

It will be appreciated that because of the degeneracy of
the genetic code, the nucleotide sequence of the gene can
vary substantially. For example, portions or all of the
gene could be chemically synthesized to yield DNA having
a different nucleotide sequence than that shown in Figure
2, yet the amino acid sequence would be preserved,
provided that the proper codon-amino acid assignments
were observed. Having established the nucleotide
sequence of the human serum albumin gene and the amino
acid sequence of the protein, the gene of the present
invention is not limited to a particular nucleotide
sequence, but includes all variations thereof as
permitted by the genetic code.
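
Degeneracy can be illustrated with the standard genetic code. In the sketch below the two nine-base sequences are invented examples, not portions of Figure 2, and `translate` is a hypothetical helper; the codon table is the standard one, built in the conventional T, C, A, G order.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): residue
    for codon, residue in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence into one-letter amino acids."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


original = "GAGAAGTGC"  # hypothetical fragment: Glu-Lys-Cys
recoded = "GAAAAATGT"   # different third bases, same residues

print(translate(original), translate(recoded))
```

The two sequences differ at three positions yet specify the same tripeptide, which is the sense in which a chemically synthesized gene may depart from the cloned sequence.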

It is believed that the amino acid sequence set
forth in Figure 2 and claimed herein represents a genomic
HSA allele that is widespread in the human population, in
contrast to the sequences previously published in the
scientific literature. Polymorphism is known for HSA.
Protein electrophoresis has revealed over twenty genetic
variants of HSA (Weitkamp et al., Ann. Hum. Genet.
London 36:381 (1973)). Two differing amino acid
sequences have been reported previously. See Lawn, R.M.,
et al., Nucl. Acids Res. 9:6103 (1981) and Dugaiczyk,
A., et al., PNAS 79:71 (1982). The DNA sequence of
Figure 2 differs from each of these published sequences.
Although some of the differences occur in the third base
position of codons or in the noncoding regions, and as
such do not cause amino acid changes, conflicting
nucleotide sequence data suggest different amino acids at
positions 97 and 396. In Figure 2, the amino acid
represented by codon 97 (GAG) is glutamic acid. The same
was reported by Lawn, et al., supra. Dugaiczyk, however,
reported that codon to be GGG (glycine). Codon 396 in
Figure 2 also is designated GAG (glutamic acid).
Dugaiczyk reported the same; however, Lawn reported codon
396 to be AAG (lysine). Thus, each of the three DNA
sequences would encode a different polypeptide. Example
IV below sets forth the procedures followed to determine
that these differences represented true protein
polymorphism and not merely experimental artifacts.
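
The codon assignments discussed above can be tabulated side by side. The sketch uses only the codons and residues named in the text; the dictionary names are illustrative.

```python
# Standard-code meaning of the three codons named in the text.
CODON_TO_RESIDUE = {"GAG": "Glu", "GGG": "Gly", "AAG": "Lys"}

# Codons at positions 97 and 396 as described for each sequence.
REPORTED_CODONS = {
    "Figure 2": {97: "GAG", 396: "GAG"},
    "Lawn et al.": {97: "GAG", 396: "AAG"},
    "Dugaiczyk et al.": {97: "GGG", 396: "GAG"},
}

residues = {
    source: tuple(CODON_TO_RESIDUE[codons[position]] for position in (97, 396))
    for source, codons in REPORTED_CODONS.items()
}

for source, pair in residues.items():
    print(source, pair)
```

Each source gives a distinct pair of residues at the two positions, which is why the three sequences would encode three different polypeptides.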

The present invention has been described in
connection with the use of E. coli as the bacterial host
for recombinant DNA containing the HSA gene, but skilled
molecular biologists will appreciate that other
gram-negative bacteria, such as Pseudomonas; gram-positive
bacteria, such as Bacillus; higher unicellular organisms,
such as yeasts and fungi, and mammalian cells can be
employed for cloning and/or expression of the HSA gene.

The invention is further illustrated by reference to
the following examples, which are not intended to be
limiting.

EXAMPLE I 

Isolation of HSA mRNA from Human Liver Tissue

Messenger RNA (mRNA) was isolated from human liver
tissue taken from a 10-year-old accident victim.
Extreme care was taken throughout the procedures to avoid
ribonuclease contamination of the mRNA preparation.
These measures included the use of new, sterile
laboratory glassware, treatment of solutions with
diethylpyrocarbonate when appropriate, followed by
autoclaving, keeping the preparation cold when possible and
using gloves to avoid contact of the preparation with skin.

Frozen human liver tissue (10.5 grams) was
homogenized in 210 ml lysis solution (4M guanidine
thiocyanate/0.1M Tris-HCl, pH 7.5/0.1M 2-mercaptoethanol)
using a Virtis homogenizer. Cellular debris was pelleted by
centrifugation at 8750 rpm, 4°C, for 10 minutes in a
Sorvall GSA rotor, and the supernatant was transferred to
a new centrifuge bottle. To the supernatant were added
0.04 volume 1M acetic acid and 0.5 volume 95% ethanol.
After 2 hours at -20°C, the mixture was centrifuged at
7500 rpm, 10 minutes, 4°C and the pellet resuspended in
50 ml wash solution (6M guanidine hydrochloride/10mM
Na2EDTA, pH 7.0/10mM dithiothreitol). Centrifugation at
5500 rpm, 10 minutes, pelleted particulate debris, and
the supernatant was transferred to a new centrifuge
bottle. To the supernatant were added 0.04 volume 1M
acetic acid and 0.5 volume 95% ethanol. After 2 hours at
-20°C, the mixture was centrifuged at 7200 rpm for 20
minutes. The pellet was resuspended in 20 ml wash
solution, and 0.04 volume 1M acetic acid and 0.5 volume
95% ethanol were added. The mixture was kept at -20°C
for 12 hours, then centrifuged at 8,000 rpm for 10
minutes at 4°C in a Sorvall SS-34 rotor. The pellet was
resuspended in 15 ml sterile distilled H2O (dH2O) and
extracted with an equal volume of (4:1)
chloroform:butanol. The aqueous phase was transferred to a
fresh tube and 0.1 volume 2.4 M sodium acetate and 2.5
volumes 95% ethanol were added. After 2.5 hours at -20°C,
the RNA was pelleted by centrifugation and the pellet was
resuspended in 2 ml sterile dH2O. A total of 19.2 mg
RNA was recovered.
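
The fractional-volume additions and the overall yield in this example reduce to simple arithmetic, sketched below. The 210 ml supernatant volume is an assumption made for illustration (the text gives only the volume of lysis solution), and `precipitation_additions` is a hypothetical helper.

```python
def precipitation_additions(sample_ml: float) -> dict[str, float]:
    """Volumes to add for the 0.04 volume acetic acid / 0.5 volume ethanol step."""
    return {
        "1M acetic acid (ml)": 0.04 * sample_ml,
        "95% ethanol (ml)": 0.5 * sample_ml,
    }


# Assumption: the clarified supernatant is taken to be about 210 ml.
additions = precipitation_additions(210.0)

# Total RNA recovered per gram of frozen tissue, from the figures in the text.
rna_yield_mg_per_g = 19.2 / 10.5

print(additions, round(rna_yield_mg_per_g, 2))
```

On these figures the preparation yielded a little under 2 mg of total RNA per gram of liver.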

mRNA was then separated from the total RNA using,
generally, the oligo(dT)-cellulose affinity
chromatography procedure described in Aviv et al., supra
and McCandliss, et al., supra. A column of 5 grams
oligo(dT)-cellulose was washed with one column volume
0.1M NaOH to denature any ribonuclease present, then
equilibrated with high salt buffer (10mM Tris-HCl, pH
7.4/0.5M NaCl/0.5% sodium dodecyl sulfate). The total
RNA preparation, dissolved in 2 ml dH2O above, was
heated at 70°C for 1 minute, then cooled on ice to room
temperature. Next, 0.1 volume 5M NaCl, 0.04 ml 0.5M
Tris-HCl, pH 7.5, and 0.1 ml 10% sodium dodecyl sulfate
(SDS) were added to the RNA. Then 8 ml high salt buffer
were added to the RNA and the solution was applied to the
column with a flow rate of about 10 drops/minute. After
the sample had passed through, unbound RNA was washed
from the column with high salt buffer. Fractions (1/2 ml
each) were collected and the optical density at 260 nm
(A260) of each fraction was measured in a
spectrophotometer. The column was washed until the A260
readings of fractions dropped below 0.05. Undesired RNA
was further washed from the column with low salt buffer
(10mM Tris-HCl, pH 7.4/0.2M NaCl/0.1% SDS) and fractions
were collected as above until the A260 had dropped to
0.05.

Next, the mRNA was eluted from the column with
elution buffer (10mM Tris-HCl, pH 7.4/1mM EDTA/0.1% SDS)
and 1 ml fractions were collected until the A260 was less
than 0.05. The first 15 fractions (those having the
highest OD260 readings) were pooled and the mRNA was
precipitated by adding 0.1 volume 2.4M sodium acetate and
2.5 volumes 95% ethanol, and placing at -20°C for 12
hours. The eluted mRNA was then pelleted by
centrifugation and resuspended in 800 µl elution buffer.
After heating the resuspended pellet at 70°C for 90 seconds
then cooling on ice, 0.1 volume 5M NaCl and 0.05 volume
10% SDS were added.

The eluted mRNA prepared above was then further
purified by passage over a second oligo(dT)-cellulose
column. A column containing 0.1 gram oligo(dT) cellulose
was washed with NaOH, then with high salt buffer as
previously described. The RNA was applied to the column
and fractions were collected with high salt, low salt,
and elution buffers as with the first column. The peak
fractions from the elution buffer step were pooled and
the twice-purified mRNA was precipitated and pelleted as
before.

The mRNA was then size-fractionated on a 12-ml
sucrose gradient as described in McCandliss et al.,
Methods in Enzymology, 79, pp. 56-58. A 5-20% sucrose
gradient was prepared in gradient buffer (0.02M sodium
acetate, pH 5.6) and chilled at 4°C for 3 hours. 100 µg
of the mRNA was resuspended in 100 µl gradient buffer,
heated at 80°C for 2 minutes, quick-cooled in an ice
bath, then layered on top of the gradient. A second
5-20% gradient had E. coli 16S and 23S rRNA (100 µg total)
loaded on it to serve as molecular weight markers.

The two gradients were centrifuged in a Beckman
SW40 rotor at 38,000 rpm for 12.5 hr at 4°C. Fractions
of about 0.5 ml were then collected and the A260 measured
(fraction #1 is that collected from the bottom of the
gradient tube). The peak was divided into 6 groups
of fractions, groups A through F as shown in Figure 3.
The fractions in each group were pooled and the mRNA
precipitated with 0.1 volume 2.4 M sodium acetate and 2.5
volumes 95% ethanol.

Fraction groups containing mRNA which encodes
protein of the size expected for HSA were identified by
in vitro translation using a rabbit reticulocyte lysate
kit (available from Bethesda Research Laboratories and
used according to manufacturer's instructions)
supplemented with 35S-methionine. A reaction mixture for
each fraction group contained the components necessary for
translation of the mRNA into radioactively-labeled
proteins which were visualized by electrophoresis on a
12.5% polyacrylamide/SDS gel, followed by fluorography.

The fluorogram showed a prominent protein band of
the size expected for HSA (68,000 daltons) among the
translation products of fraction groups B and C. Group B
had a much lower percentage of protein products in the
undesirable low molecular weight range, so the mRNA in
group B was chosen for use as a template in the synthesis
of cDNA.

EXAMPLE II
Synthesis of HSA cDNA 

Generally, the cDNA synthesis procedure of
McCandliss et al., Methods in Enzymology, 79, pp. 601-
607 (1981) was used. Incorporation of a radioactively
labeled deoxynucleotide allowed monitoring of the
synthesis and calculation of yields at each step.

The first strand of cDNA was synthesized on the 
mRNA template, using oligo-dT as a primer, as follows. 



Prepared mix and kept on ice:

    0.5M Tris-HCl, pH 8.3                       20 µl
    1.4M KCl                                    10 µl
    0.25M MgCl2                                  8 µl
    0.05M dATP, pH 7.0                           2 µl
    0.05M TTP, pH 7.0                            2 µl
    0.05M dCTP, pH 7.0                           2 µl
    0.05M dGTP, pH 7.0                           2 µl
    0.01M dithiothreitol                         4 µl
    sterile distilled H2O                       45 µl
    aqueous label, α-32P-dCTP (10 µCi/µl)        5 µl
                                               100 µl

Added remaining components:

    oligo(dT)12-18 (250 µg/ml)                  20 µl
    actinomycin D (500 µg/ml, aqueous)          16 µl
    10 µg mRNA, "B" fraction                    20 µl
    sterile dH2O                                37 µl
    *AMV reverse transcriptase (16 u/µl)         7 µl
    Total volume:                              200 µl

* Avian myeloblastosis virus (AMV) reverse
transcriptase is kept at -80°C and thawed briefly to add
as the last component.
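
The final concentrations in the 200 µl first-strand reaction follow from the stock concentrations and volumes listed above, as sketched below. The dithiothreitol stock is read here as 0.01M, which is an assumption about a poorly legible entry; the variable names are illustrative.

```python
# (component, volume in µl, stock molarity or None when not given as a molar stock)
FIRST_STRAND_MIX = [
    ("Tris-HCl, pH 8.3", 20, 0.5),
    ("KCl", 10, 1.4),
    ("MgCl2", 8, 0.25),
    ("dATP", 2, 0.05),
    ("TTP", 2, 0.05),
    ("dCTP", 2, 0.05),
    ("dGTP", 2, 0.05),
    ("dithiothreitol", 4, 0.01),  # assumption: stock read as 0.01M
    ("sterile distilled H2O", 45, None),
    ("32P-dCTP label", 5, None),
    ("oligo(dT)", 20, None),
    ("actinomycin D", 16, None),
    ("mRNA, B fraction", 20, None),
    ("sterile dH2O", 37, None),
    ("AMV reverse transcriptase", 7, None),
]

total_ul = sum(volume for _, volume, _ in FIRST_STRAND_MIX)

# Final concentration (mM) = stock (M) x 1000 x component volume / total volume.
final_mM = {
    name: 1000 * stock * volume / total_ul
    for name, volume, stock in FIRST_STRAND_MIX
    if stock is not None
}

print(total_ul)
for name, value in final_mM.items():
    print(f"{name}: {value:.2f} mM")
```

The component volumes sum to the stated 200 µl, giving roughly 50 mM Tris-HCl, 70 mM KCl, 10 mM MgCl2 and 0.5 mM of each unlabeled deoxynucleoside triphosphate.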

The reaction mixture was kept on ice 5 minutes and
2 µl were removed and counted in ASC scintillation fluid
in order to determine the specific activity of the dCTP.
The reaction mixture was then incubated 10 minutes at
46°C. 20 µl 0.2M EDTA pH 8.0 was added to stop the
reaction, and the mixture was then extracted with an
equal volume (1:1) phenol:chloroform.

0.14 volume 80% glycerol was added and the sample was
chromatographed on a 0.7 x 17 cm Sephadex G-100 column.
Once the sample had entered the column, G100 buffer (10mM
Tris-HCl, pH 8.0/1mM EDTA/100mM NaCl) was added to the
column and 5-drop (about 275 µl) fractions were collected.
The radioactive fractions were "Cerenkov counted" and the
cDNA fractions comprising the peak counts per minute were
pooled. The mRNA/cDNA hybrids were precipitated by
adding 0.1 volume 2.4M sodium acetate and 2.5 volumes 95%
ethanol, placing in a dry ice/ethanol bath for 30
minutes, then pelleting by centrifugation at 10,000 rpm,
4°C, for 20 minutes. The pellet was resuspended in 300 µl
0.1M NaOH and heated at 70°C for 20 minutes to hydrolyze
the RNA, leaving single-stranded cDNA. 30 µl 1M HCl were
added to neutralize the solution. The DNA was
precipitated by adding 5 µg tRNA, 1/10 volume 2.4M sodium
acetate, and 2.5 volumes 95% ethanol, placing in a dry
ice-ethanol bath 10 minutes, and centrifuging in a
microfuge 10 minutes at 4°C.
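
The neutralization step can be checked by comparing moles of base and acid. The sketch reads the NaOH volume as 300 µl, which is an assumption about a poorly legible figure, and `micromoles` is a hypothetical helper.

```python
def micromoles(volume_ul: float, molarity: float) -> float:
    """Amount in µmol for a volume in µl at a concentration in mol/l."""
    return volume_ul * molarity


naoh_umol = micromoles(300, 0.1)  # assumption: 300 µl of 0.1M NaOH
hcl_umol = micromoles(30, 1.0)    # 30 µl of 1M HCl

print(naoh_umol, hcl_umol)
```

On that reading the acid added is equimolar to the base, as expected for a neutralization.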

The pellet was resuspended in the following mix:

     40 µl   0.5M potassium phosphate, pH 7.4
      8 µl   0.25M MgCl2
      2 µl   0.1M dithiothreitol
      1 µl   0.05M dATP, pH 7.0
      1 µl   0.05M dCTP, pH 7.0
      1 µl   0.05M dGTP, pH 7.0
      1 µl   0.05M TTP, pH 7.0
    124 µl   sterile dH2O
    178 µl

Next, 22 µl DNA polymerase I Klenow fragment (5 u/µl,
available from Boehringer-Mannheim) were added.

The reaction mixture was then incubated in a 15°C
water bath for 12 hours. 20 µl 0.2M EDTA pH 8.0 was added
to stop the reaction and the mixture was extracted with
an equal volume (1:1) phenol:chloroform. 0.14 volume
glycerol was added to the aqueous phase.

The sample, which now contains double-stranded
cDNA, was run over a Sephadex G100 column and the peak
cDNA fractions were pooled and precipitated as before.
The double-stranded DNA has a 3' "hairpin loop" as
previously described, which was removed with S1 nuclease
as follows. The pellet was resuspended in 72 µl sterile
distilled water and then 18 µl 5X S1 buffer (1M
NaCl/0.25M sodium acetate, pH 4.5/5mM ZnSO4/2.5%
glycerol) were added. An enzyme mix was prepared by
adding 2.5 µl (50 units) of S1 nuclease (20 u/µl) to 47.5
µl 1X S1 buffer. 10 µl of enzyme mix was added to the
90 µl DNA solution, which was then incubated at 37°C for
20 minutes. Addition of 20 µl 0.2M sodium EDTA stopped the
reaction, and the reaction mixture was extracted with an
equal volume (1:1) phenol:chloroform. The aqueous phase was
loaded onto a 5-25% sucrose gradient and spun at 38,000
rpm for 17.5 hours at 5°C in an ultracentrifuge.

One-ml fractions were collected and "Cerenkov
counted." Fractions were pooled with fractions 1-6, 7-9,
and 10-12 comprising the 3 pools. Fraction #1 was the
fraction taken from the bottom of the gradient. DNA was
precipitated by adding 0.1 volume 2.4M sodium acetate,
1-2 µg tRNA, and 2.5 volumes 95% ethanol to each pool, then
placing them at -20°C overnight. The DNA was pelleted by
centrifugation at 25K for 30 minutes at 4°C. After
slightly desiccating the pellets, the DNA from each pool
was resuspended in 200 µl dH2O and precipitated again with
ethanol and sodium acetate. Pellets were resuspended in
22 µl dH2O and spun in a microfuge 5 minutes to pellet
insoluble matter. 2 µl of each cDNA-containing
supernatant were analyzed by electrophoresis on a 6%
polyacrylamide gel. Autoradiography of the gel showed
that the DNA in the pool of fractions 1-6 had an average
size of 1100 base-pairs (bp) and included DNA in the 200
bp range. This pool was chosen for addition of "polyC
tails" to the 3' ends of the cDNA, using, generally, the
homopolymeric tailing procedure described in McCandliss
et al., page 601 et seq., supra. A 5000 molar excess of
dCTP over 3' cDNA ends was found to give good results.
The reaction mixture was as follows:

  20 µl    cDNA (about 43 ng)
           dCTP (645 pmol, lyophilized)
   2.4 µl  10X TdT buffer*
   1.6 µl  dH2O
  24.0 µl

*(10X TdT buffer = 1.4M potassium cacodylate/0.3M Tris-
HCl, pH 7.0/10mM CoCl2/1mM DTT)
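
The stated molar excess of dCTP over 3' cDNA ends can be checked roughly from the quantities given above. The sketch below is an illustration under stated assumptions, not the calculation actually used: it takes the average size of 1100 bp reported for the chosen pool, assumes an average mass of 660 g/mol per base pair, and counts two 3' ends per linear duplex. The function name `three_prime_ends_pmol` is hypothetical.

```python
AVG_BP_MASS = 660.0   # g/mol per base pair; a common approximation, assumed here


def three_prime_ends_pmol(mass_ng, avg_length_bp):
    """pmol of 3' ends in a linear double-stranded DNA sample (2 ends per duplex)."""
    duplex_pmol = mass_ng * 1e-9 / (avg_length_bp * AVG_BP_MASS) * 1e12
    return 2.0 * duplex_pmol


# Values from the text: about 43 ng cDNA, average size 1100 bp, 645 pmol dCTP.
ends_pmol = three_prime_ends_pmol(mass_ng=43.0, avg_length_bp=1100)
fold_excess = 645.0 / ends_pmol

if __name__ == "__main__":
    print(f"3' ends: {ends_pmol:.3f} pmol")
    print(f"dCTP excess over 3' ends: about {fold_excess:.0f}-fold")
```

With these assumed values the ratio comes out somewhat above 5000, which is in line with the excess described above; a different assumed average length or base-pair mass shifts the result proportionally.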

The reaction mixture was preincubated at 37°C for 2
minutes, 2 µl were removed for use in calculations, then
2 µl (6.66 units) P-L Biochemicals terminal deoxynucleotidyl
transferase were added and incubation at 37°C was
continued for 5 minutes. Calculations based on
incorporation of dCTP indicated that the 3' ends of the
cDNA now carried "polyC tails" an average of 14 nucleotides
in length. 80 µl T.E. buffer (10mM Tris-HCl, pH
7.6/1mM EDTA) were added to the DNA and the solution was
extracted with an equal volume of (1:1)
phenol:chloroform. The organic phase was then reextracted
with 100 µl dH2O and the two aqueous phases were
combined.

The C-tailed double-stranded cDNA was then annealed
to plasmid pBR322 DNA which had been linearized with the
restriction endonuclease PstI, then "G-tailed" by the
homopolymeric tailing method. The complementary single-stranded
C and G "tails" will anneal, producing
recombinant plasmids with cDNA inserts at the PstI site.
  200 µl    cDNA, C-tailed (39.2 ng)
   10.5 µl  pBR322-PstI, G-tailed (302 ng)
   93 µl    10X buffer*
  626.5 µl  dH2O
  930 µl

The reaction mix was placed in an insulated water
bath at 70°C. The bath was then transferred to a 37°C
room and allowed to cool slowly to 37°C overnight, then
transferred to room temperature, where the bath cooled to
30°C over several hours. The reaction mixture was then
stored at 4°C.

*(10X annealing buffer = 1.5M NaCl/100mM Tris-HCl,
pH 7.5/10mM EDTA)

E. coli HB101 cells were made competent for
transformation by known calcium chloride treatment
procedures. 200 µl aliquots of competent HB101 cells were
each combined with 40 µl of the annealing reaction mixture
and kept on ice 20 minutes, then heat-shocked at 42°C for
2 minutes. 2.8 ml Luria broth were added to
each tube and the tubes were incubated at 37°C for 1 hour. The tubes'
contents were aliquoted (1/2 ml aliquots) into tubes
containing Luria broth plus 0.7% agar, and then were
poured onto Luria broth-agar plates containing 25 µg/ml
tetracycline and incubated at 37°C until colonies
appeared.

Only those cells transformed by pBR322 (with or
without a cDNA insert) can grow on tetracycline plates.
Approximately 2500 transformant colonies grew on the
plates.



EXAMPLE III
Isolation of a Full-Length HSA cDNA 

The transformants were initially screened with a
rat serum albumin (RSA) cDNA fragment. The RSA cDNA
fragment was obtained from a pBR322 plasmid containing a
2000 bp RSA cDNA insert. This recombinant plasmid is
similar to, but contains a longer cDNA insert than, the
plasmid prAlbl described in Proc. Nat'l. Acad. Sci. USA,
76, 4370 (1979). A 1480 bp rat serum albumin (RSA)
fragment was isolated by digesting the plasmid carrying
the RSA cDNA with the restriction endonuclease BstE II
(all restriction endonucleases used in these examples
were used according to manufacturer's specifications).
The fragment was then radioactively labeled with α-32P by
the "nick translation" procedure (Maniatis et al., PNAS
USA, 72:3961 (1975)).

About 80 10-ml cultures of individual transformants
were grown and plasmid DNA was isolated by known plasmid
"mini-prep" procedures. The partially purified plasmid
DNAs were subjected to electrophoresis on 0.8% agarose
gels. The DNA was transferred from the gels to
nitrocellulose filters using the "Southern blotting"
technique (Southern, E.M., J. Molec. Biology, 98, 503
(1975)).

The nitrocellulose filters were immersed for 2
hours at 42°C in prehybridization solution (50%
formamide/5X SSC*/0.05M NaPO4, pH 6.5/5X
Denhardt's*/100 µg/ml salmon sperm DNA). The filters were
then transferred into hybridization solution (50%
formamide/10% dextran sulfate/5X SSC/20mM NaPO4, pH
6.5/1X Denhardt's/50 µg/ml salmon sperm DNA). The nick-translated
1480 bp RSA fragment prepared above was heated
at 100°C for 5 minutes, then quick cooled on ice, and
this probe was added to the hybridization solution at 2 X
10^ cpm probe per ml of solution. The filters were
incubated in the hybridization solution at 42°C for 18
hours, then washed twice in 2X SSC and once in 0.1X SSC at
room temperature.

Autoradiography of the filters revealed non-specific
hybridization of the probe to all plasmid DNAs.
Therefore, several Southern blot filters were washed in
2X SSC at various temperatures from 65 to 80°C. DNA
from one plasmid on a filter washed at 65°C hybridized
strongly with the probe.

DNA sequencing revealed that the "positive" clone,
called 6C3, was a partial-length human serum albumin
clone. Plasmid DNA was isolated from a culture of 6C3
and digested with the restriction endonuclease PstI. One
of the resulting HSA cDNA fragments, about 475 bp in
length, was isolated and "nick translated" for use as a
probe. The entire bank of approximately 2500 clones was
screened with this probe using a modification of the
hybridization procedure of Grunstein et al., supra.

*50X Denhardt's stock = 1% polyvinylpyrrolidone/1%
ficoll/1% bovine serum albumin.
1X SSC = 150mM NaCl/15mM sodium citrate, pH 6.8 with
citric acid
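
The footnoted stock definitions fix the working concentrations of the SSC and Denhardt's components used in the prehybridization, hybridization and wash solutions. The short sketch below simply scales those definitions; the function names `ssc` and `denhardt` are hypothetical, and nothing beyond the two footnoted recipes is assumed.

```python
# 1X SSC and 50X Denhardt's stock, as defined in the footnotes.
SSC_1X_mM = {"NaCl": 150.0, "sodium citrate": 15.0}
DENHARDT_50X_PERCENT = {
    "polyvinylpyrrolidone": 1.0,
    "ficoll": 1.0,
    "bovine serum albumin": 1.0,
}


def ssc(strength):
    """Salt concentrations (mM) of SSC at the given strength, e.g. 5 for 5X."""
    return {salt: mM * strength for salt, mM in SSC_1X_mM.items()}


def denhardt(strength):
    """Percent of each Denhardt's component at the given strength, e.g. 5 for 5X."""
    return {name: pct * strength / 50.0
            for name, pct in DENHARDT_50X_PERCENT.items()}


if __name__ == "__main__":
    for x in (5, 2, 0.1):
        print(f"{x}X SSC:", ssc(x))
    for x in (5, 1):
        print(f"{x}X Denhardt's:", denhardt(x))
```

For example, 5X SSC corresponds to 750 mM NaCl and 75 mM sodium citrate, and 5X Denhardt's to 0.1% of each component.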

The transformant colonies were individually picked
from the plates into separate wells in 96-well microtiter
plates containing Luria broth plus 0.2% glucose plus
25 µg/ml tetracycline and incubated at 37°C overnight.
Using a transfer device with 48 metal prongs, samples of
each culture were transferred to two Luria
broth/agar/tetracycline plates, one plate previously
overlaid with a nitrocellulose filter, and incubated at
37°C for 2 days. The filters were then placed successively
on Whatman filter paper soaked in each of the following
solutions: 0.5M NaOH; 1M Tris, pH 7.4; 1M Tris, pH 7.4;
2X SSC; 90% ethanol; and 90% ethanol (in that order, 7
minutes per solution). The nitrocellulose filters were
then baked in vacuo at 80°C for 2 hours.

Prehybridization and hybridization procedures were
as described above, except that the three washes were at
room temperature. 90 positive hybridization signals were
detected by autoradiography. Some of the "positive
clones" were further analyzed by restriction analysis
(e.g., PstI digestion) and hybridization of "Southern
blots" as above.

A clone bearing a full length HSA cDNA was
identified and confirmed by DNA sequencing. The
recombinant plasmid containing this HSA cDNA insert was
termed pGX401 and is shown in Figure 4. A partial
restriction map of the HSA cDNA is shown in Figure 1,
while Figure 2 shows the DNA sequence (5' to 3' strand) of
the cloned gene and the amino acid sequence it
specifies.

A sample of E. coli HB101 transformed with pGX401
has been deposited at the U.S. Dept. of Agriculture
Northern Regional Research Center in Peoria, Illinois,
under accession number NRRL B-15784.

EXAMPLE IV 

DNA Sequence Analysis of HSA cDNA Prepared
from Human Liver Samples Taken from
Different Individuals

In comparing the DNA sequence of the HSA cDNA
insert in pGX401 (Example III) with the cDNA sequences
published by Lawn et al., supra, and Dugaiczyk et al.,
supra, two codon differences were found that predict
amino acid differences. The pGX401 sequence and the
sequence reported by Lawn et al. indicated that codon 97
of the mature protein was GAG (glutamic acid), while
Dugaiczyk et al. reported it to be GGG (glycine). In the
pGX401 sequence and the sequence reported by Dugaiczyk
et al., codon 396 was GAG (glutamic acid),
while Lawn et al. reported that codon to be AAG (lysine).
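
The two codon differences described above each amount to a single nucleotide change. The following sketch, offered only as an illustration, tabulates the reported codons, translates them with the standard genetic code, and locates the differing position; the names `REPORTED` and `point_differences` are hypothetical.

```python
# Standard genetic code, built from the usual TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Codons as reported in the text for the mature protein.
REPORTED = {
    97: {"pGX401": "GAG", "Lawn et al.": "GAG", "Dugaiczyk et al.": "GGG"},
    396: {"pGX401": "GAG", "Dugaiczyk et al.": "GAG", "Lawn et al.": "AAG"},
}


def point_differences(codon_a, codon_b):
    """List (position, base_a, base_b) for each position where two codons differ."""
    return [(pos + 1, x, y)
            for pos, (x, y) in enumerate(zip(codon_a, codon_b)) if x != y]


if __name__ == "__main__":
    for codon_number, sources in REPORTED.items():
        reference = sources["pGX401"]
        for source, codon in sources.items():
            print(codon_number, source, codon, CODON_TABLE[codon],
                  point_differences(reference, codon))
```

GAG and GGG differ only at the second position and GAG and AAG only at the first, so either discrepancy could in principle reflect a true polymorphism or a single-base reading error, which is the question the following experiments address.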

To gain some insight into whether these differences 
represented true protein polymorphisms or merely 
experimental artifacts, the DNA sequence in the regions 
of codons 97 and 396 was determined for several new 
independent HSA genes. 

Messenger RNA (mRNA) was isolated from normal human
liver tissue taken from four different individuals. The
procedures of Example I were followed except that sucrose
gradient fractionation of oligo (dT)-cellulose-purified
mRNA was omitted. Double stranded cDNA was synthesized
from this mRNA template by the procedures described in
Example II and poly(dC) "tails" were added according to
Deng and Wu, NAR 9:4173, 1981.

The vector into which the dC-tailed cDNA was
inserted was plasmid pGX1066. This plasmid comprises the
phage λ tR^ transcription terminator upstream of a bank of
ten closely-spaced unique restriction sites, which in
turn is upstream of the λ4S transcription terminator.
E. coli strain GX1170 {P- leu hsdR thi sugB aal-1,2 lac
Ul ara trgCSSSO laciqj transformed with pGX1066 has
been deposited with the American Type Culture Collection,
Rockville, Maryland, as ATCC No. 39955.

Plasmid pGX1066 was linearized with PstI and
poly(dG) tails were added using the homopolymeric tailing
method described by Deng and Wu (Nucleic Acids Res., 9:
4173 (1981)). The vector DNA and cDNA were then annealed
as described in Example II. E. coli strain DH1 cells
[F-, endA1, hsdR17 (rk-, mk-), supE44, thi-1, λ-,
recA1, gyrA96, relA1] were made competent and transformed
with the annealing reaction mix. Both E. coli strain DH1
and the transformation procedure used are described by D.
Hanahan (J. Molec. Biol., 166: 557 (1983)). Transformants
were plated on LM plates (1% (w/v) Bacto tryptone,
0.5% (w/v) yeast extract, 10mM NaCl, 10mM MgSO4·7H2O,
1.5% (w/v) Bacto agar) with 35 µg/ml ampicillin added.

Transformed E. coli colonies were screened for the
presence of HSA sequences by Grunstein-Hogness filter
hybridization (Gergen et al., 1979, Nuc. Acids Res.
7:2115; Wallace et al., 1981, Nuc. Acids Res. 9:879)
using kinased oligomers or nick-translated HSA cDNA
fragments as probes. For identification of clones
carrying HSA cDNA containing codon 396, a synthetic
oligonucleotide, 5' TTGTACTCTCCAAGCTGC 3', corresponding
to codons 397-402 (and the last nucleotide of codon 396),
was used. For detection of clones carrying HSA cDNA
containing codon 97, either of two synthetic
oligonucleotides, 5' TCTCTTCATTGTCATGAAAAGC 3',
corresponding to codons 126-132 (and one nucleotide of
codon 133), or 5' TTCTTGTTTTGCACAGC 3', corresponding to
codons 90 (last 2 nucleotides) - 95, or a nick-translated
HSA fragment (derived from pGX401), corresponding to
codons -1 to 364, was used. Upon identification of clones
containing the HSA sequence of interest, restriction
fragments were subcloned into an M13 phage. HSA cDNA-carrying
phage were identified by screening plaques
according to the procedure of Benton and Davis (Science,
196:180 (1977)). The DNA sequence was determined with
these M13 clones by the dideoxy method (Biggin et al.,
Proc. Nat. Acad. Sci. U.S.A., 80:3963 (1983)).
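
The synthetic oligonucleotides above are written 5' to 3' and are complementary to the coding strand, so the codons they cover can be read from their reverse complements. The sketch below is illustrative only: it uses the oligonucleotide sequences exactly as printed above and the standard genetic code, and the reading-frame offsets are assumptions taken from the codon ranges stated in the text. The helper names are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq, offset=0):
    """Translate complete codons starting at `offset`; a trailing partial codon is ignored."""
    usable = seq[offset:]
    end = len(usable) - len(usable) % 3
    return "".join(CODON_TABLE[usable[i:i + 3]] for i in range(0, end, 3))


# (oligonucleotide as printed, assumed number of leading nucleotides from a partial codon)
PROBES = {
    "codon 396 probe": ("TTGTACTCTCCAAGCTGC", 1),
    "codon 97 probe (22-mer)": ("TCTCTTCATTGTCATGAAAAGC", 0),
    "codon 97 probe (17-mer)": ("TTCTTGTTTTGCACAGC", 2),
}

if __name__ == "__main__":
    for name, (oligo, offset) in PROBES.items():
        coding = reverse_complement(oligo)
        print(name, coding, translate(coding, offset))
```

Read this way, the 18-mer as printed gives five complete codons (Gln Leu Gly Glu Tyr) after its leading nucleotide, the 22-mer gives Ala Phe His Asp Asn Glu Glu, and the 17-mer gives Cys Ala Lys Gln Glu after the two nucleotides assigned to codon 90.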

By the procedures described above, transformants
containing HSA cDNA that included codon 396 were derived
from all four human livers. Transformants containing HSA
cDNA that included codon 97 were derived from only two of
the four livers. The DNA sequence in all cases
(including 60 to 100 base pairs on each side of the codon
in question) matched the sequence determined for pGX401.

Messenger RNA then was isolated from normal human
liver samples taken from two more individuals, and the
sequence at codon 97 was determined using a modification
of the Sanger sequencing procedure in which reverse
transcriptase was used to copy the single-stranded RNA
template. A synthetic oligonucleotide,
5' TGTCTCTTCATTGTCATGAAAAGC 3', corresponding to codons
126-133, was used as a primer. The mRNA, purified by
oligo (dT)-cellulose chromatography as previously
described, was incubated in a reaction volume of 2 µl
containing 10 mM Tris·HCl (pH 8.3), 140 mM KCl, 10 mM
MgCl2, 20 mM β-mercaptoethanol, 1.6 mM dNTP, 0.2 mM
ddNTP, 250 ng RNA, 5 ng kinased primer and 1.88 units
reverse transcriptase (Life Sciences, Inc.). After
overlaying the solution with 4 µl of mineral oil the
reaction was incubated at 42°C for fifteen minutes and
was terminated by the addition of 7 µl of 250 mM Na2
EDTA. The mineral oil was extracted with ether and
removed with a drawn-out pasteur pipette. Formamide
loading buffer was added to the samples prior to electrophoresis
on a urea sequencing gel. The gels were run
until the bromphenol blue tracking dye had migrated to
the bottom. They then were dried under vacuum and
exposed to X-ray film with two intensifying screens for
periods between twelve hours and several days.

The HSA sequence at codon 97 for both liver samples
was identical to the sequence at codon 97 in pGX401.
(See Figure 5.) The reliability of the technique to
determine nucleotide sequence from mRNA was evaluated
using polyA+ RNA prepared from the liver that was the
source of the cDNA originally cloned in pGX401. The
results (Figure 5) showed that the sequence determined in
this manner was identical to the sequence originally
determined in pGX401.

CLAIMS FOR THE DESIGNATED STATES: BE, DE, FR, IT,
LU, NL, SE, CH and UK

1. A synthetic gene coding for human serum albumin. 

2. An isolated human serum albumin gene.

3. An isolated prepro-human serum albumin gene.

4. A human serum albumin gene as claimed in claim 1,
comprising the following deoxyribonucleotide sequence
which corresponds to the indicated amino acid sequence:



Asp 
G A Y


Ala 

G C X 


His 
CAY 


Lys 
A A M


Ser 
Q R S 


Glu 
GAM 


Val 
G T X 


Ala 
G C X 


His 
CAY 


Arg 
L G N 


Phe 
T T Y 


Lys 
A A M 


Asp 
GAY 


Leu 
Y T Z 


Gly 
G G X 


Glu 
GAM 


Glu 
GAM 


Asn 
A A Y 


Phe 
T T Y 


Lys 
A A M


Ala 
G C X 


Leu 
Y T Z 


Val 
G T X 


Leu 
Y T Z 


Ile
A T H 


Ala 
G C X 


Phe 
T T Y 


Ala 
G C X 


Gln
C A M


Tyr 
T A Y 


Leu 
Y T Z 


Gln
C A M


Gln
C A M


Cys 
T G Y 


Pro 
C C X 


Phe 
T T Y 


Glu 
GAM 


Asp 
GAY 


His 
CAY 


Val 
G T X 


Lys 
A A M 


Leu 
Y T Z 


Val 
G T X 


Asn 
A A Y 


Glu 
GAM 


Val 
G T X 


Thr 
A C X 


Glu 
GAM 


Phe 
T T Y 


Ala 
G C X 


Lys 
A A M 


Thr 

A C X 


Cys 
T G Y 


Val 
G T X 


Ala 

G C X 


Asp 
GAY 


Glu 
GAM 


Ser 
Q R S 


Ala 
G C X 


Glu 
GAM 


Asn 
A A Y 


Cys 
T G Y 


Asp 
GAY 


Lys 
A A M


Ser 
Q R S 


Leu 
Y T Z 


His
CAY 


Thr 
A C X 


Leu 
Y T Z 


Phe 
T T Y 


Gly 
G G X 


Asp 
GAY 


Lys 
A A M


Leu 
Y T Z


Cys 
T G Y 


Thr 
A C X 


Val 
G T X 


Ala 
G C X 


Thr 
A C X 


Leu 
Y T Z 


Arg 
L G N 


Glu 
GAM 


Thr 
A C X 


Tyr 
T A Y 


Gly 
G G X 


Glu 
GAM 


Met 
A T G 


Ala 
G C X 


Asp 
GAY 


Cys 
T G Y 


Cys 
T G Y 


Ala 
G C X 


Lys 
A A M 


Gln
C A M


Glu 
GAM 


Pro 
C C X

Glu
GAM 

His 
CAY 

Pro 
C C X

Asp 
GAY 

Asp 
GAY 

Lys 
A A M 

Arg 
L G N 

Glu
GAM 

Tyr 
T A Y 

Cys 
T G Y 

Cys 
T G Y 

Leu 
Y T Z 

Ser 
Q R S 

Ala 
G C X 

Arg 
L G N

Ala 
G C X 

Lys 
A A M

Lys 
A A M 

Val
G T X 



Arg 
L G N 

Lys 
A A M 

Arg 
L G N 

Val 
G T X 

Asn 
A A Y 

Tyr 
T A Y 

His 
CAY 

Leu 

Y T Z 

Lys 
A A M 

Ala 
G C X 

Leu 

Y T Z 

Arg 
L G N 

Ala 
G C X 

Ser 
Q R S 

Ala 
G C X 

Arg 
L G N 

Ala 
G C X 

Phe 
T T Y 

His 
CAY 



Asn 
A A Y 

Asp 
GAY 

Leu 
Y T Z 



Met 
A T G 

Glu 
GAM 

Leu 

Y T Z 

Pro 
C C X 

Leu 

Y T Z 

Ala 
G C X 

Gln
C A M

Phe 
T T Y 

Asp 
GAY 

Lys 
A A M 

Leu 

Y T Z 

Phe 
T T Y 

Leu 

Y T Z 

Glu 
GAM 

Val 
G T X 

Thr 
A C X 



Glu 
GAM 

Asp 
GAY 

Val 
G T X 

Cys 
T G Y 

Glu 
GAM 

Tyr 
T A Y 

Tyr 
T A Y 

Phe 
T T Y 

Ala 
G C X 

Ala 
G C X 

Pro 
C C X 

Glu 
GAM 

Gln
C A M

Gln
C A M

Lys 
A A M 

Ser 
Q R S 

Phe 
T T Y 

Thr 
A C X 

Glu 
GAM 



Cys 
T G Y 

Asn 
A A Y 

Arg 
L G N

Thr 
A C X 

Thr 
A C X 

Glu 
GAM 

Phe 
T T Y 

Phe 
T T Y 

Phe 
T T Y 

Asp 
GAY 

Lys 
A A M 

Gly 
G G X 

Arg 
L G N 

Lys 
A A M 

Ala 
G C X 

Gln
C A M

Ala 
G C X 

Asp 
GAY 

Cys 
T G Y 



Phe 
T T Y 

Pro 
C C X 

Pro 
C C X 

Ala 
G C X 

Phe 
T T Y 

Ile
A T H 

Thr 
A C X 

Ala 
G C X 

Thr 
A C X 

Lys 
A A M 

Leu 

Y T Z 

Lys 
A A M 

Leu 

Y T Z 

Phe 
T T Y 

Trp 
T G G 

Arg 
L G N 

Glu 
GAM 

Leu 

Y T Z 

Cys 
T G Y 



Leu 


Gln


Y T Z 


CAM 


Asn 


Leu 


A A Y 


Y T Z 


Glu 


Val 


GAM 


G T X 


Phe 


His 


T T Y 


CAY 


Leu


Lys 


Y T Z 


A A M 


Ala 


Arg 


G C X 


L G N 


Ala 


Pro 


G C X 


C C X 


Lys 


Arg 


A A M 


L G N 


Glu 


Cys 


GAM 


T G Y 


Ala 


Ala 


G C X 


G C X 


Asp 


Glu 


GAY 


GAM 


Ala 


Ser 


G C X 


Q R S 


Lys 


Cys 


A A M 


T G Y 


Gly 


Glu 


G G X 


GAM 


Ala 


Val 


G C X 


G T X


Phe


Pro


T T Y


C C X 


Val 


Ser 


G T X 


Q R S 


Thr 


Lys 


A C X 


A A M 


His 


Gly 


CAY 


G G X 



Asp 


Leu 


GAY 


Y T Z 


Arg 


Ala 


L G N


G C X


Cys


Glu


T G Y


G A M


Ser


Lys


Q R S


A A M


Lys


Pro


A A M


C C X


Cys


Ile


T G Y


A T H


Glu


Met


G A M


A T G


Phe


Ala


T T Y


G C X 


Lys 


Asp 


A A M 


GAY 


Glu 


Ala 


GAM 


G C X 


Met


Phe 


A T G 


T T Y 


Arg 


His 


L G N 


CAY 


Leu 


Leu 


Y T Z 


Y T Z 


Tyr 


Glu 


T A Y 


GAM 


Cys 


Ala 


T G Y 


G C X 


Cys 


Tyr 


T G Y 


T A Y 


Phe 


Lys 


T T Y 


A A M

Leu Glu

Y T Z GAM 

Asp Leu 

GAY Y T Z 

Asn Gin 

A A Y CAM 



Leu Lys 

Y T Z A A M 

Leu Phe 

Y T Z T T Y 

Ala Glu 

G C X GAM 

Pro Ala 

C C X G C X 

Val Asp 

G T X GAY 



Val Cys 

G T X T G Y 

Lys Asp 

A A M GAY 

Phe Tyr 

T T Y T A Y 

Pro Asp 

C C X GAY 

Leu Arg 

Y T Z L G N 

Thr Thr 

A C X A C X 

Ala Ala 

G C X G C X 

Ala Lys 

G C X A A M 



Pro Pro 
C C X C C X



Cys 


Ala 


T G Y 


G C X


Ala


Lys


G C X


A A M


Asp 


Ser 


GAY 


Q R S 


Glu 


Cys 


GAM 


T G Y 


Glu 


Lys 


G A M


A A M


Val


Glu


G T X


G A M


Asp


Phe


G A Y


T T Y


Phe 


Val 


T T Y 


G T X 


Lys 


Asn 


A A M 


A A Y 


Val 


Phe 


G T X 


T T Y 


Glu 


Tyr 


GAM 


T A Y 


Tyr 


Ser 


T A Y 


Q R S 


Leu 


Ala 


Y T Z 


G C X 


Leu 


Glu 


Y T Z 


GAM 


Asp 


Pro 


GAY 


C C X 


Val 


Phe 


G T X 


T T Y 


Val 


Glu 


G T X 


G A M


Asp


Asp


G A Y


G A Y


Tyr


Ile


T A Y


A T H


Ile


Ser 


A T H 


Q R S 


Cys 


Glu 


T G Y 


GAM 


Ser 


His 


Q R S 


CAY 


Asn 


Asp 


A A Y


GAY 


Pro 


Ser 


C C X 


Q R S 


Glu 


Ser 


GAM 


Q R S 


Tyr 


Ala 


T A Y 


G C X 


Leu 


Gly 


Y T Z 


G G X 


Ala 


Arg 


G C X 


L G N 


Val 


Val 


G T X 


G T X 


Lys 


Thr 


A A M 


A C X 


Lys


Cys


A A M 


T G Y 


His 


Glu 


CAY 


GAM 


Asp 


Glu 


GAY 


GAM 


Glu 


Pro 


GAM 


C C X 



Gln
C A M


Asn 
A A Y 


Glu 
GAM 


Leu 
Y T Z


Tyr 
T A Y 


Lys 
A A M 


Val 
G T X 


Arg 
L G N 


Gln
C A M


Leu 
Y T Z 


Glu 
GAM 


Val 
G T X 


Val 
G T X 


Gly 
G G X 


Pro
C C X


Glu
G A M


Ala
G C X


Glu
G A M


Leu
Y T Z


Asn
A A Y


Glu
G A M


Lys
A A M


Val
G T X


Thr
A C X


Leu
Y T Z


Val
G T X


Ser
Q R S


Ala
G C X


Tyr 
T A Y 


Val 
G T X 


Glu 
GAM 


Thr 
A C X 


Ile
A T H 


Cys 
T G Y


Phe
T T Y 


Ile
A T H 


Phe 
T T Y 


Glu 
GAM 


Phe 
T T Y 


Gln
C A M


Tyr 
T A Y 


Thr 
A C X 


Ser 
Q R S 


Thr 
A C X 


Ser 

Q R S 


Arg
L G N


Lys
A A M


Ala
G C X


Lys
A A M


Asp
G A Y


Tyr
T A Y


Gln
C A M


Leu 
Y T Z 


Thr 
A C X 


Pro 
C C X 


Lys
A A M


Cys
T G Y


Asn
A A Y


Arg
L G N


Leu
Y T Z


Glu
G A M


Pro
C C X


Lys
A A M


Phe 
T T Y 


Thr 
A C X 


Thr 
A C X 


Leu 
Y T Z 



Lys 


Gln


A A M


C A M


Gln


Leu


C A M


Y T Z


Asn


Ala


A A Y


G C X


Lys


Lys


A A M


A A M


Pro


Thr


C C X


A C X


Asn


Leu


A A Y


Y T Z


Cys


Cys


T G Y


T G Y


Arg


Met


L G N


A T G


Leu


Ser


Y T Z


Q R S


Cys


Val


T G Y


G T X


Val 


Ser 


G T X 


Q R S 


Cys 


Thr 


T G Y 


A C X 


Arg 


Pro 


L G N 


C C X 


Val 


Asp 


G T X 


GAY 


Glu 


Phe 


GAM 


T T Y 


Phe 


His 


T T Y 


CAY 


Ser 


Glu 


Q R S 


G A M


Asn


Cys


A A Y


T G Y


Gly 


Glu 


G G X 


GAM 


Leu 


Phe 


Y T Z 


T T Y 


Val 


Pro 


G T X 


C C X 


Leu 


Val 


Y T Z


G T X 


Gly 


Lys 


G G X 


A A M 


Lys 


His 


A A M 


CAY 


Pro 


Cys 


C C X 


T G Y 


Val 


Val 


G T X 


G T X 


Leu 


His 


Y T Z 


CAY 


Asp 


Arg 


GAY 


L G N 


Glu 


Ser 


GAM 


Q R S 


Gly 


Phe 


G G X 


T T Y 


Glu 


Thr 


GAM 


A C X 


Asn 


Ala 


A A Y 


G C X 


Ala 


Asp 


G C X 


GAY 


Lys 


Glu 


A A M 


G A M


Arg


Gln


Ile


Lys 


Lys 


Glu 


Thr 


Ala 


Jj u M 


u A n 


A 1 n 


A A n 


A A n 


P A If 

A n 


A P V 

A Lr A 


G C X 


Leu 


Val 


Glu


Leu 


Val 


Lys 


His 


Lys 


X 1 ^ 


T Y 


u A n 


X 1 a 


r« qi V 


A A M 

A A n 


P A V 

U A X 


A K U 

A A H 


Pro 


Lys 


Ala 


Thr 


Lys 


Glu 


Glu 


Leu 


n V 
V« U A 


A & M 

A A Sfl 


P V 
U U A 


A U A 


A A n 


V9 A n 


<i A H 


X T Z 


Lys 


Ala 


Val 


Met 


Asp 


Asp 


Phe 


Ala 


A A n 


G C X 


m vr 

G T X 


A T G 


G A X 


GAY 


fii m V 

T T Y 


G C X 


Ala 


Phe 


Val 


Glu 


Lys 


Cys 


Cys 


Lys 


G C X 


T T Y


G T X 


GAM 


A A M 


T G Y 


T G Y 


A A M 


Ala 


Asp 


Asp 


Lys 


Glu 


Thr 


Cys 


Phe 


G C X 


GAY 


GAY 


A A M 


GAM 


A C X 


T G Y 


T T Y 


Ala 


Glu 


Glu 


Gly 


Lys 


Lys 


Leu 


Val 


G C X 


GAM 


GAM 


G G X 


A A M 


A A M 


Y T Z 


G T X 


Ala 


Ala 


Ser 


Glu 


Ala 


Val 


Leu 


Gly 


G C X 


G C X 


Q R S 


GAM 


G C X 


G T X 


Y T Z 


G G X 



Leu 

Y T Z T A A 

wherein the 5' to 3' strand, beginning with the amino
terminus, and the amino acids for which each triplet codes
are shown, and wherein the abbreviations have the
following standard meanings:

A is deoxyadenyl
T is thymidyl
G is deoxyguanyl
C is deoxycytosyl
X is A, T, C or G
Y is T or C
When Y is C, Z is A, T, C or G
When Y is T, Z is A or G
H is A, T or C
Q is T or A
When Q is T, R is C and S is A, T, C or G
When Q is A, R is G and S is T or C
M is A or G
L is A or C
When L is A, N is A or G
When L is C, N is A, T, C or G

GLY is glycine
ALA is alanine
VAL is valine
LEU is leucine
ILE is isoleucine
SER is serine
THR is threonine
PHE is phenylalanine
TYR is tyrosine
TRP is tryptophan
CYS is cysteine
MET is methionine
ASP is aspartic acid
GLU is glutamic acid
LYS is lysine
ARG is arginine
HIS is histidine
PRO is proline
GLN is glutamine
ASN is asparagine
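
The degenerate triplet notation defined above can be checked mechanically against the standard genetic code. The sketch below is an illustration only, not part of the claim: it expands each degenerate triplet used for an amino acid into concrete codons and compares the result with the full set of standard codons for that amino acid. The names `SYMBOLS`, `LINKED`, `CLAIM_TRIPLETS`, `expand` and `codons_for` are hypothetical.

```python
from itertools import product

# Standard genetic code, built from the usual TCAG-ordered amino acid string.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Symbols that stand alone (X, Y, H and M are degenerate).
SYMBOLS = {"A": "A", "T": "T", "G": "G", "C": "C",
           "X": "ATCG", "Y": "TC", "H": "ATC", "M": "AG"}

# Triplets whose positions depend on one another, spelled out from the "When ..." rules.
LINKED = {
    "YTZ": {"CT" + z for z in "ATCG"} | {"TT" + z for z in "AG"},   # leucine
    "QRS": {"TC" + s for s in "ATCG"} | {"AG" + s for s in "TC"},   # serine
    "LGN": {"AG" + n for n in "AG"} | {"CG" + n for n in "ATCG"},   # arginine
}

# One-letter amino acid code -> degenerate triplet as used in the sequence listing.
CLAIM_TRIPLETS = {
    "G": "G G X", "A": "G C X", "V": "G T X", "L": "Y T Z", "I": "A T H",
    "S": "Q R S", "T": "A C X", "F": "T T Y", "Y": "T A Y", "W": "T G G",
    "C": "T G Y", "M": "A T G", "D": "G A Y", "E": "G A M", "K": "A A M",
    "R": "L G N", "H": "C A Y", "P": "C C X", "Q": "C A M", "N": "A A Y",
}


def expand(triplet):
    """Return the set of concrete codons denoted by a degenerate triplet."""
    t = triplet.replace(" ", "")
    if t in LINKED:
        return set(LINKED[t])
    return {"".join(p) for p in product(*(SYMBOLS[ch] for ch in t))}


def codons_for(amino_acid):
    """All standard codons for a one-letter amino acid code."""
    return {codon for codon, aa in CODON_TABLE.items() if aa == amino_acid}


mismatches = [aa for aa, t in CLAIM_TRIPLETS.items() if expand(t) != codons_for(aa)]

if __name__ == "__main__":
    for aa, t in CLAIM_TRIPLETS.items():
        print(aa, t, sorted(expand(t)))
    print("mismatches:", mismatches)
```

Under the definitions as read here, every degenerate triplet expands to exactly the standard codon set of its amino acid, so the claimed sequence covers every DNA sequence encoding the indicated amino acid sequence.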

5. A prepro-serum albumin gene as claimed in claim 1 
comprising the following deoxyribonucleotide sequence: 





Met 


Lys 


Trp 


Val 


Thr 


Phe 




A T G 


A A M 


T G G 


G T X 


A C X 


T T Y 


Ile


Ser 


Leu 


Leu 


Phe 


Leu 


Phe 


A T H 


Q R S 


Y T Z


Y T Z 


T T Y 


Y T Z 


T T Y 


Ser 


Ser 


Ala 


Tyr 


Ser 


Arg 


Gly 


Q R S 


Q R S 


G C X 


T A Y 


Q R S 


L G N 


G G X 



Val Phe 
G T X T T Y 



Arg Arg Asp Ala His Lys 
L G N L G N G A Y G C X C A Y A A M



Ser 


Glu


Q R S 


GAM 


Asp 


Leu 


GAY 


Y T Z 


Ala 


Leu 


G C X 


Y T Z 


Gln


Tyr 


CAM 


T A Y 


Glu 


Asp 


GAM 


GAY 


Glu


Val 


GAM 


G T X 


Cys 


Val 


T G Y 


G T X 


Asn 


Cys


A A Y 


T G Y 


Leu 


Phe 


Y T Z 


T T Y 


Val 


Ala 


G T X 


G C X 


Gly 


Glu 


G G X 


GAM 


Lys 


Gin 


A A M 


CAM 


Cys 


Phe 


T G Y 


T T Y 


Asn 


Pro 


A A Y 


C C X 


Arg 


Pro 


L G N 


C C X 


Thr 


Ala 


A C X 


G C X 


Thr 


Phe 


A C X 


T T Y


Val


Ala 


G T X 


G C X 


Gly


Glu 


G G X 


GAM 


Val 


Leu 


G T X 


Y T Z 


Leu 


Gin 


Y T Z 


CAM 


His 


Val 


CAY 


G T X 


Thr 


Glu 


A C X 


GAM 


Ala 


Asp


G C X 


GAY 


Asp


Lys


GAY 


A A M 


Gly


Asp 


G G X 


GAY 


Thr 


Leu 


A C X 


Y T Z 


Met


Ala 


A T G 


G C X 


Glu 


Pro 


GAM 


C C X 


Leu


Gln


Y T Z


CAM 


Asn 


Leu 


A A Y 


Y T Z 


Glu 


Val 


GAM 


G T X 


Phe 


His 


T T Y 


CAY 


Leu 


Lys 


Y T Z 


A A M 



His Arg
CAY L G N 



Glu Asn 

GAM A A Y 

Ile Ala

A T H G C X

Gln Cys

C A M T G Y

Lys Leu 

A A M Y T Z 

Phe Ala 

T T Y G C X 

Glu Ser 

GAM Q R S 

Ser Leu 

Q R S Y T Z 

Lys Leu 

A A M Y T Z

Arg Glu 

L G N GAM 

Asp Cys 

GAY T G Y 

Glu Arg 

GAM L G N 

His Lys 

CAY A A M 

Pro Arg 

C C X L G N 

Asp Val

GAY G T X 

Asp Asn 

GAY A A Y 

Lys Tyr 

A A M T A Y

Phe Lys

T T Y A A M 

Phe Lys 

T T Y A A M 

Phe Ala 

T T Y G C X 

Pro Phe 

C C X T T Y 

Val Asn 

G T X A A Y 

Lys Thr 

A A M A C X

Ala Glu 

G C X GAM 

His Thr 

CAY A C X 

Cys Thr 

T G Y A C X 

Thr Tyr 

A C X T A Y 

Cys Ala 

T G Y G C X 

Asn Glu 

A A Y GAM 

Asp Asp 

GAY GAY 

Leu Val 

Y T Z G T X 

Met Cys 

A T G T G Y 

Glu Glu 

GAM GAM 

Leu Tyr 

Y T Z T A Y


Glu


Ile


G A M


A T H


Phe


Thr


T T Y


A C X


Phe


Ala


T T Y


G C X


Phe 


Thr 


T T Y 


A C X 


Asp 


Lys 


GAY 


A A M 


Lys 


Leu 


A A M 


Y T Z 


Gly 


Lys 


G G X


A A M


Arg


Leu


L G N


Y T Z


Lys 


Phe 


A A M 


T T Y 


Ala 


Trp 


G C X 


T G G 


Gin 


Arg 


CAM 


L G N 


Ala 


Glu 


G C X 


GAM 


Asp 


Leu 


GAY 


Y T Z


Cys 


Cys 


T G Y 


T G Y 


Cys 


Ala 


T G Y 


G C X 


Ala 


Lys 


G C X 


A A M 


Asp 


Ser 


GAY 


Q R S 



Ala 


Arg 


G C X 


L G N


Ala


Pro


G C X


C C X


Lys 


Arg 


A A M 


L G N 


Glu 


Cys 


GAM 


T G Y 


Ala 


Ala 


G C X 


G C X 


Asp 


Glu 


GAY 


GAM 


Ala 


Ser 


G C X 


Q R S 


Lys 


Cys 


A A M 


T G Y 


Gly 


Glu 


G G X 


GAM 


Ala 


Val 


G C X 


G T X 


Phe 


Pro 


T T Y 


C C X 


Val 


Ser 


G T X 


Q R S 


Thr 


Lys 


A C X 


A A M 


His 


Gly 


CAY 


G G X 


Asp 


Asp 


GAY 


GAY 


Tyr 


Ile


T A Y 


A T H 


Ile


Ser 


A T H 


Q R S 



Arg His 
L G N CAY 



Glu Leu 
G A M Y T Z



Tyr Lys 

T A Y A A M 

Cys Ala 

T G Y G C X 

Cys Leu 

T G Y Y T Z

Leu Arg

Y T Z L G N

Ser Ala 

Q R S G C X 

Ala Ser 

G C X Q R S 

Arg Ala 

L G N G C X 

Ala Arg 

G C X L G N 



Lys Ala 

A A M G C X 

Lys Phe 

A A M T T Y 

Val His 

G T X CAY 

Asp Leu 

G A Y Y T Z

Arg Ala 

L G N G C X 

Cys Glu 

T G Y GAM 

Ser Lys 

Q R S A A M 



Pro Tyr 

C C X T A Y 

Leu Phe 

Y T Z T T Y

Ala Ala 

G C X G C X 

Gln Ala

C A M G C X

Phe Pro 

T T Y C C X 

Asp Glu 

GAY GAM 

Lys Gln

A A M C A M

Leu Gln

Y T Z C A M

Phe Lys 

T T Y A A M 

Leu Ser 

Y T Z Q R S

Glu Phe 

GAM T T Y 

Val Thr 

G T X A C X 

Thr Glu 

A C X GAM 

Leu Glu 

Y T Z G A M

Asp Leu

G A Y Y T Z

Asn Gln

A A Y C A M

Leu Lys

Y T Z A A M

Glu
GAM 

Glu
GAM 

Val 
G T X 

Asp 
GAY 

Phe 
T T Y 

Lys 
A A M 

Val 
G T X 

Glu
GAM 

Tyr 
T A Y 

Leu 

Y T Z 

Leu 

Y T Z 

Asp 
GAY 

Val 
G T X 

Val 
G T X 

Lys 
A A M 

Gin 
CAM 

Asn 
A A Y 



Cys 
T G Y 

Lys 
A A M 

Glu
GAM 

Phe 
T T Y 

Val 
G T X 

Asn 
A A Y 

Phe 
T T Y 

Tyr 
T A Y

Ser 
Q R S 

Ala 
G C X 

Glu
GAM 

Pro 

C C X 

Phe 
T T Y 

Glu
G A M

Gln
C A M

Leu 
Y T Z 

Ala 
G C X 



Cys 
T G Y 

Ser 
Q R S 

Asn 
A A Y 

Pro 
C C X 

Glu
GAM 

Tyr 
T A Y 

Leu 
Y T Z 

Ala 
G C X 

Val 
G T X 



Lys 
A A M 



Lys 
A A M 

His
CAY 

Asp 
GAY 

Glu
GAM 

Asn 
A A Y 

Gly 
G G X 

Leu 
Y T Z

Glu
GAM 

His 
CAY 



Asp 
GAY 

Ser 
Q R S 

Ser 
Q R S 

Ala 
G C X

Gly 
G G X 

Arg 
L G N

Val 
G T X 

Thr 
A C X 

Cys 
T G Y 

Glu 
GAM 

Glu 
GAM 

Pro 
C C X 

Cys 
T G Y 

Glu 
GAM 

Phe 
T T Y 



Lys 
A A M 



Cys 
T G Y 

Glu 
GAM 

Phe 
T T Y 

Lys 
A A M 

Glu 
GAM 

Met 
A T G 

Arg 
L G N 

Leu 
Y T Z 

Tyr 
T A Y 

Cys 
T G Y 

Cys 
T G Y 

Phe 
T T Y 

Gln
C A M

Glu 
GAM 

Tyr 
T A Y 

Val 
G T X 



Pro 
C C X 

Ile
A T H

Met 
A T G 

Ala 
G C X 

Asp 
GAY 

Ala 
G C X 

Phe 
T T Y 

His 
CAY 

Leu 

Y T Z 

Glu 
GAM 

Ala 
G C X 

Tyr 
T A Y 

Lys 
A A M 

Asn 
A A Y 

Leu 

Y T Z 



Lys 
A A M 



Arg 
L G N 



Leu 


Phe 


Y T Z


T T Y


Ala


Glu


G C X


G A M


Pro


Ala


C C X


G C X


Val


Asp


G T X


G A Y


Val


Cys


G T X


T G Y


Lys


Asp


A A M


G A Y


Phe


Tyr


T T Y


T A Y


Pro


Asp


C C X


G A Y


Leu


Arg


Y T Z


L G N


Thr


Thr


A C X


A C X


Ala


Ala


G C X


G C X


Ala


Lys


G C X


A A M


Pro


Pro


C C X


C C X


Phe


Ile


T T Y


A T H


Phe 


Glu 


T T Y 


GAM 


Phe 


Gin 


T T Y 


CAM 


Tyr 


Thr 


T A Y 


A C X 



Lys Lys 
A A M A A M



Pro Thr 
C C X A C X 



Asn Leu 

A A Y Y T Z 

Cys Cys 

T G Y T G Y 



Arg Met 

L G N A T G 

Leu Ser

Y T Z Q R S 

Cys Val 

T G Y G T X 

Val Ser 

G T X Q R S 

Cys Thr 

T G Y A C X 



Arg Pro 
L G N C C X 



Val Asp 

G T X GAY 

Glu Phe 

GAM T T Y 

Phe His 

T T Y CAY 

Ser Glu 

Q R S GAM 

Lys Glu 

A A M GAM 

Val Lys 

G T X A A M 

Lys Glu 

A A M GAM 

Asp Asp 

G A Y G A Y

Val Pro

G T X C C X 

Leu Val 

Y T Z G T X 

Gly Lys 

G G X A A M 

Lys His 

A A M CAY 

Pro Cys 

C C X T G Y

Val Val 

G T X G T X 

Leu His 

Y T Z CAY 

Asp Arg 

GAY L G M 

Glu Ser 

GAM Q R S 

Gly Phe 

G G X T T Y 



Glu Thr 

GAM A C X 

Asn Ala 

A A Y G C X 



Ala Asp 

G C X GAY 

Lys Glu 

A A M GAM 

Thr Ala 

A C X G C X 

His Lys 

CAY A A M 

Glu Leu 

GAM Y T Z 

Phe Ala 

T T Y G C X 



Gin Leu 

CAM Y T Z 

Glu Val 

GAM G T X 

Val Gly 

G T X G G X 

Pro Glu 

C C X GAM 

Ala Glu 

G C X GAM 

Leu Asn 

Y T 2 A A Y 

Glu Lys 

GAM A A M 

Val Thr 

G T X A C X 

Leu Val 

Y T Z G T X 

Ser Ala 

Q R S G C X 

Tyr Val 

T A Y G T X 

Glu Thr 

GAM A C X 

He Cys 

A T H T G Y 

Arg Gin 

L G N CAM 

Leu Val 

Y T Z G T X 

Pro Lys 

C C X A A M 

Lys Ala 

A A M G C X 

Ala Phe 

G C X T T Y 
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Ser Thr 

Q R S A C X 

Ser Arg 

Q R S L G N 

Se r Lys 

Q R S A A M 

Ala Lys 

G C X A A M 

Asp Tyr 

GAY T A Y 

Gin Leu 

CAM Y T 2 

Thr Pro 

A C X C C X 

Lys Cys 

A A M T G Y 

Asn Arg 

A A Y L G H 

Leu Glu 

Y T Z GAM 

Pro Lys 

C C X A A M 

Phe Thr 

T T Y A C X 

Thr Leu 

A C X Y T Z 

He Lys 

A T H A A N 

Glu Leu 

GAM Y T Z 

Ala Thr 

G C X A C X 

Val Met 

G T X A T G 

Val Glu 

G T X GAM 
Lys Cys Cys Lys Ala Asp Asp Lys
AAM TGY TGY AAM GCX GAY GAY AAM

Glu Thr Cys Phe Ala Glu Glu Gly
GAM ACX TGY TTY GCX GAM GAM GGX

Lys Lys Leu Val Ala Ala Ser Glu
AAM AAM YTZ GTX GCX GCX QRS GAM

Ala Val Leu Gly Leu
GCX GTX YTZ GGX YTZ TAA







wherein the 5' to 3' strand, beginning with the amino
terminus, and the amino acids for which each triplet
codes are shown, and wherein the abbreviations are
defined as in claim 4.

6. A human serum albumin gene as claimed in claim 4
comprising the following deoxyribonucleotide sequence:
GAT GCA CAC AAG AGT GAG GTT GCT CAT CGG TTT AAA GAT TTG
GGA GAA GAA AAT TTC AAA GCC TTG GTG TTG ATT GCC TTT GCT
CAG TAT CTT CAG CAG TGT CCA TTT GAA GAT CAT GTA AAA TTA
GTG AAT GAA GTA ACT GAA TTT GCA AAA ACA TGT GTT GCT GAT
GAG TCA GCT GAA AAT TGT GAC AAA TCA CTT CAT ACC CTT TTT
GGA GAC AAA TTA TGC ACA GTT GCA ACT CTT CGT GAA ACC TAT
GGT GAA ATG GCT GAC TGC TGT GCA AAA CAA GAA CCT GAG AGA
AAT GAA TGC TTC TTG CAA CAC AAA GAT GAC AAC CCA AAC CTC
CCC CGA TTG GTG AGA CCA GAG GTT GAT GTG ATG TGC ACT GCT
TTT CAT GAC AAT GAA GAG ACA TTT TTG AAA AAA TAC TTA TAT
GAA ATT GCC AGA AGA CAT CCT TAC TTT TAT GCC CCG GAA CTC
CTT TTC TTT GCT AAA AGG TAT AAA GCT GCT TTT ACA GAA TGT
TGC CAA GCT GCT GAT AAA GCT GCC TGC CTG TTG CCA AAG CTC
GAT GAA CTT CGG GAT GAA GGG AAG GCT TCG TCT GCC AAA CAG
AGA CTC AAG TGT GCC AGT CTC CAA AAA TTT GGA GAA AGA GCT
TTC AAA GCA TGG GCG GTG GCT CGC CTG AGC CAG AGA TTT CCC
AAA GCT GAG TTT GCA GAA GTT TCC AAG TTA GTG ACA GAT CTT
ACC AAA GTC CAC ACG GAA TGC TGC CAT GGA GAT CTG CTT GAA
TGT GCT GAT GAC AGG GCG GAC CTT GCC AAG TAT ATC TGT GAA
AAT CAA GAT TCG ATC TCC AGT AAA CTG AAG GAA TGC TGT GAA
AAA CCT CTG TTG GAA AAA TCC CAC TGC ATT GCC GAA GTG GAA
AAT GAT GAG ATG CCT GCT GAC TTG CCT TCA TTA GCT GCT GAT
TTT GTT GAA AGT AAG GAT GTT TGC AAA AAC TAT GCT GAG GCA
AAG GAT GTC TTC CTG GGC ATG TTT TTG TAT GAA TAT GCA AGA
AGG CAT CCT GAT TAC TCT GTC GTG CTG CTG CTG AGA CTT GCC
AAG ACA TAT GAA ACC ACT CTA GAG AAG TGC TGT GCC GCT GCA
GAT CCT CAT GAA TGC TAT GCC AAA GTG TTC GAT GAA TTT AAA
CCT CCT GTG GAA GAG CCT CAG AAT TTA ATC AAA CAA AAT TGT
GAG CTT TTT GAG CAG CTT GGA GAG TAC AAA TTC CAG AAT GCG
CTA TTA GTT CGT TAC ACC AAG AAA GTA CCC CAA GTG TCA ACT
CCA ACT CTT GTA GAG GTC TCA AGA AAC CTA GGA AAA GTG GGC
AGC AAA TGT TGT AAA CAT CCT GAA GCA AAA AGA ATG CCC TGT
GCA GAA GAC TAT CTA TCC GTG GTC CTG AAC CAG TTA TGT GTG
TTG CAT GAG AAA ACG CCA GTA AGT GAC AGA GTC ACC AAA TGC
TGC ACA GAA TCC TTG GTG AAC AGG CGA CCA TGC TTT TCA GCT
CTG GAA GTC GAT GAA ACA TAC GTT CCC AAA GAG TTT AAT GCT
GAA ACA TTC ACC TTC CAT GCA GAT ATA TGC ACA CTT TCT GAG
AAG GAG AGA CAA ATC AAG AAA CAA ACT GCA CTT GTT GAG CTC
GTG AAA CAC AAG CCC AAG GCA ACA AAA GAG CAA CTG AAA GCT
GTT ATG GAT GAT TTC GCA GCT TTT GTA GAG AAG TGC TGC AAG
GCT GAC GAT AAG GAG ACC TGC TTT GCC GAG GAG GGT AAA AAA
CTT GTT GCT GCA AGT CAA GCT GCC TTA GGC TTA TAA







wherein the 5' to 3' strand, beginning with the amino
terminus, is shown, and wherein the abbreviations are
defined as in claim 4. 
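
As an informal cross-check of a triplet listing such as the one above, the triplets can be translated with the standard genetic code and compared with the amino acid sequence they are stated to encode. The Python sketch below is illustrative only and forms no part of the claimed subject matter; the names `CODON_TABLE` and `translate` are hypothetical, and the sample input is simply the first fourteen triplets of the mature-protein coding sequence.

```python
from itertools import product

# Standard genetic code, listed with the bases in T, C, A, G order
# (third position varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence):
    """Translate a DNA coding strand (5' to 3') into one-letter amino acids.

    Whitespace between triplets is ignored. Translation stops at the first
    stop codon, which is not included in the result.
    """
    dna = "".join(sequence.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    protein = []
    for i in range(0, len(dna), 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


first_triplets = "GAT GCA CAC AAG AGT GAG GTT GCT CAT CGG TTT AAA GAT TTG"
print(translate(first_triplets))  # DAHKSEVAHRFKDL
```

In three-letter form the result reads Asp Ala His Lys Ser Glu Val Ala His Arg Phe Lys Asp Leu. A mismatch between such a translation and a listed amino acid would point to a transcription error in one of the two listings rather than to any property of the gene itself.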



7. A human prepro-serum albumin gene as claimed in claim 5
comprising the following deoxyribonucleotide sequence:
ATG AAG TGG GTA ACC TTT ATT TCC CTT CTT TTT CTC TTT AGC
TCG GCT TAT TCC AGG GGT GTG TTT CGT CGA GAT GCA CAC AAG
AGT GAG GTT GCT CAT CGG TTT AAA GAT TTG GGA GAA GAA AAT
TTC AAA GCC TTG GTG TTG ATT GCC TTT GCT CAG TAT CTT CAG
CAG TGT CCA TTT GAA GAT CAT GTA AAA TTA GTG AAT GAA GTA
ACT GAA TTT GCA AAA ACA TGT GTT GCT GAT GAG TCA GCT GAA
AAT TGT GAC AAA TCA CTT CAT ACC CTT TTT GGA GAC AAA TTA
TGC ACA GTT GCA ACT CTT CGT GAA ACC TAT GGT GAA ATG GCT
GAC TGC TGT GCA AAA CAA GAA CCT GAG AGA AAT GAA TGC TTC
TTG CAA CAC AAA GAT GAC AAC CCA AAC CTC CCC CGA TTG GTG
AGA CCA GAG GTT GAT GTG ATG TGC ACT GCT TTT CAT GAC AAT
GAA GAG ACA TTT TTG AAA AAA TAC TTA TAT GAA ATT GCC AGA
AGA CAT CCT TAC TTT TAT GCC CCG GAA CTC CTT TTC TTT GCT
AAA AGG TAT AAA GCT GCT TTT ACA GAA TGT TGC CAA GCT GCT
GAT AAA GCT GCC TGC CTG TTG CCA AAG CTC GAT GAA CTT CGG
GAT GAA GGG AAG GCT TCG TCT GCC AAA CAG AGA CTC AAG TGT
GCC AGT CTC CAA AAA TTT GGA GAA AGA GCT TTC AAA GCA TGG
GCG GTG GCT CGC CTG AGC CAG AGA TTT CCC AAA GCT GAG TTT
GCA GAA GTT TCC AAG TTA GTG ACA GAT CTT ACC AAA GTC CAC
ACG GAA TGC TGC CAT GGA GAT CTG CTT GAA TGT GCT GAT GAC
AGG GCG GAC CTT GCC AAG TAT ATC TGT GAA AAT CAA GAT TCG
ATC TCC AGT AAA CTG AAG GAA TGC TGT GAA AAA CCT CTG TTG
GAA AAA TCC CAC TGC ATT GCC GAA GTG GAA AAT GAT GAG ATG
CCT GCT GAC TTG CCT TCA TTA GCT GCT GAT TTT GTT GAA AGT
AAG GAT GTT TGC AAA AAC TAT GCT GAG GCA AAG GAT GTC TTC
CTG GGC ATG TTT TTG TAT GAA TAT GCA AGA AGG CAT CCT GAT
TAC TCT GTC GTG CTG CTG CTG AGA CTT GCC AAG ACA TAT GAA
ACC ACT CTA GAG AAG TGC TGT GCC GCT GCA GAT CCT CAT GAA
TGC TAT GCC AAA GTG TTC GAT GAA TTT AAA CCT CCT GTG GAA
GAG CCT CAG AAT TTA ATC AAA CAA AAT TGT GAG CTT TTT GAG
CAG CTT GGA GAG TAC AAA TTC CAG AAT GCG CTA TTA GTT CGT
TAC ACC AAG AAA GTA CCC CAA GTG TCA ACT CCA ACT CTT GTA
GAG GTC TCA AGA AAC CTA GGA AAA GTG GGC AGC AAA TGT TGT
AAA CAT CCT GAA GCA AAA AGA ATG CCC TGT GCA GAA GAC TAT
CTA TCC GTG GTC CTG AAC CAG TTA TGT GTG TTG CAT GAG AAA
ACG CCA GTA AGT GAC AGA GTC ACC AAA TGC TGC ACA GAA TCC
TTG GTG AAC AGG CGA CCA TGC TTT TCA GCT CTG GAA GTC GAT
GAA ACA TAC GTT CCC AAA GAG TTT AAT GCT GAA ACA TTC ACC
TTC CAT GCA GAT ATA TGC ACA CTT TCT GAG AAG GAG AGA CAA
ATC AAG AAA CAA ACT GCA CTT GTT GAG CTC GTG AAA CAC AAG
CCC AAG GCA ACA AAA GAG CAA CTG AAA GCT GTT ATG GAT GAT
TTC GCA GCT TTT GTA GAG AAG TGC TGC AAG GCT GAC GAT AAG
GAG ACC TGC TTT GCC GAG GAG GGT AAA AAA CTT GTT GCT GCA
AGT CAA GCT GCC TTA GGC TTA TAA

wherein the 5' to 3' strand, beginning with the amino
terminus, is shown, and wherein the abbreviations are
defined as in claim 4.

8. A human prepro-serum albumin gene as claimed in claim 7
comprised in the following deoxyribonucleotide sequence:
5' TCTCTTCTGTCAACCCCACGCCTTTGGCACA ATG AAG TGG GTA
ACC TTT ATT TCC CTT CTT TTT CTC TTT AGC TCG GCT TAT TCC
AGG GGT GTG TTT CGT CGA GAT GCA CAC AAG AGT GAG GTT GCT
CAT CGG TTT AAA GAT TTG GGA GAA GAA AAT TTC AAA GCC TTG
GTG TTG ATT GCC TTT GCT CAG TAT CTT CAG CAG TGT CCA TTT
GAA GAT CAT GTA AAA TTA GTG AAT GAA GTA ACT GAA TTT GCA
AAA ACA TGT GTT GCT GAT GAG TCA GCT GAA AAT TGT GAC AAA
TCA CTT CAT ACC CTT TTT GGA GAC AAA TTA TGC ACA GTT GCA
ACT CTT CGT GAA ACC TAT GGT GAA ATG GCT GAC TGC TGT GCA
AAA CAA GAA CCT GAG AGA AAT GAA TGC TTC TTG CAA CAC AAA
GAT GAC AAC CCA AAC CTC CCC CGA TTG GTG AGA CCA GAG GTT
GAT GTG ATG TGC ACT GCT TTT CAT GAC AAT GAA GAG ACA TTT
TTG AAA AAA TAC TTA TAT GAA ATT GCC AGA AGA CAT CCT TAC
TTT TAT GCC CCG GAA CTC CTT TTC TTT GCT AAA AGG TAT AAA
GCT GCT TTT ACA GAA TGT TGC CAA GCT GCT GAT AAA GCT GCC
TGC CTG TTG CCA AAG CTC GAT GAA CTT CGG GAT GAA GGG AAG
GCT TCG TCT GCC AAA CAG AGA CTC AAG TGT GCC AGT CTC CAA
AAA TTT GGA GAA AGA GCT TTC AAA GCA TGG GCG GTG GCT CGC
CTG AGC CAG AGA TTT CCC AAA GCT GAG TTT GCA GAA GTT TCC
AAG TTA GTG ACA GAT CTT ACC AAA GTC CAC ACG GAA TGC TGC
CAT GGA GAT CTG CTT GAA TGT GCT GAT GAC AGG GCG GAC CTT
GCC AAG TAT ATC TGT GAA AAT CAA GAT TCG ATC TCC AGT AAA
CTG AAG GAA TGC TGT GAA AAA CCT CTG TTG GAA AAA TCC CAC
TGC ATT GCC GAA GTG GAA AAT GAT GAG ATG CCT GCT GAC TTG
CCT TCA TTA GCT GCT GAT TTT GTT GAA AGT AAG GAT GTT TGC
AAA AAC TAT GCT GAG GCA AAG GAT GTC TTC CTG GGC ATG TTT
TTG TAT GAA TAT GCA AGA AGG CAT CCT GAT TAC TCT GTC GTG
CTG CTG CTG AGA CTT GCC AAG ACA TAT GAA ACC ACT CTA GAG
AAG TGC TGT GCC GCT GCA GAT CCT CAT GAA TGC TAT GCC AAA
GTG TTC GAT GAA TTT AAA CCT CCT GTG GAA GAG CCT CAG AAT
TTA ATC AAA CAA AAT TGT GAG CTT TTT GAG CAG CTT GGA GAG
TAC AAA TTC CAG AAT GCG CTA TTA GTT CGT TAC ACC AAG AAA
GTA CCC CAA GTG TCA ACT CCA ACT CTT GTA GAG GTC TCA AGA
AAC CTA GGA AAA GTG GGC AGC AAA TGT TGT AAA CAT CCT GAA
GCA AAA AGA ATG CCC TGT GCA GAA GAC TAT CTA TCC GTG GTC
CTG AAC CAG TTA TGT GTG TTG CAT GAG AAA ACG CCA GTA AGT
GAC AGA GTC ACC AAA TGC TGC ACA GAA TCC TTG GTG AAC AGG
CGA CCA TGC TTT TCA GCT CTG GAA GTC GAT GAA ACA TAC GTT
CCC AAA GAG TTT AAT GCT GAA ACA TTC ACC TTC CAT GCA GAT
ATA TGC ACA CTT TCT GAG AAG GAG AGA CAA ATC AAG AAA CAA
ACT GCA CTT GTT GAG CTC GTG AAA CAC AAG CCC AAG GCA ACA
AAA GAG CAA CTG AAA GCT GTT ATG GAT GAT TTC GCA GCT TTT
GTA GAG AAG TGC TGC AAG GCT GAC GAT AAG GAG ACC TGC TTT
GCC GAG GAG GGT AAA AAA CTT GTT GCT GCA AGT CAA GCT GCC
TTA GGC TTA TAA CATCTACATTTAAAAGCATCTCAGCCTACCATGAGAATA
AGAGAAAGAAAATGAAGATCAAAAGCTTATTCATCTGTTTTCTTTTTCGTTGGTG
TTTTAATCATTTTGCCTCTTTTCTCTGTGCTTCAATTAATAAAAAATGGAAAGAA
TCTAA

wherein the 5' to 3' strand, beginning with the amino
terminus is shown, and wherein the abbreviations are 
defined as in claim 4. 
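
The sequence recited above carries untranslated nucleotides on both sides of the coding triplets, which run from the ATG initiation codon to the TAA termination codon. As an illustration only, the Python sketch below locates such a coding region in a string. The function name `find_coding_region` is hypothetical, the input is a deliberately shortened splice of the first and last triplets (not the full sequence), and the "first ATG" rule is a simplifying assumption that happens to hold here because the 5' untranslated region contains no ATG.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(sequence):
    """Return (start, end, coding) for the first ATG-initiated reading frame.

    `start` and `end` are 0-based offsets into the whitespace-stripped
    sequence; `coding` runs from the ATG through the first in-frame stop
    codon. Returns None if no complete frame is found.
    """
    dna = "".join(sequence.split()).upper()
    start = dna.find("ATG")
    if start == -1:
        return None
    # Walk the reading frame one triplet at a time until a stop codon appears.
    for i in range(start, len(dna) - 2, 3):
        if dna[i:i + 3] in STOP_CODONS:
            return start, i + 3, dna[start:i + 3]
    return None


# Shortened, illustrative input: flanking regions plus a few triplets only.
toy = (
    "TCTCTTCTGTCAACCCCACGCCTTTGGCACA"  # 5' untranslated region
    "ATG AAG TGG GTA"                  # first four triplets
    "TTA GGC TTA TAA"                  # last three triplets and the stop codon
    "CATCTACATTTAAAAGC"                # beginning of the 3' untranslated region
)
start, end, coding = find_coding_region(toy)
print(start, end, coding)  # 31 55 ATGAAGTGGGTATTAGGCTTATAA
```

For a sequence in which an ATG occurs upstream of the true initiation codon, this simple rule would select the wrong frame, so the known amino-terminal residues should be used to confirm the start position.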

9. A plasmid having the capability of replication
in a prokaryotic or eukaryotic organism, comprising
a deoxyribonucleotide sequence coding for human
serum albumin.

10. A plasmid as claimed in claim 8 having the
capability of replication in a prokaryotic organism,
comprising a human serum albumin or human prepro-serum
albumin gene as claimed in any one of claims 1 to 8.
11. A plasmid as claimed in claim 9 or claim 10
having the capability of replication in a prokaryotic
organism of the genus Escherichia.

12. The plasmid of claim 10 designated pGX401
(deposited in E. coli HB101 at the U.S. Dept. of
Agriculture Northern Regional Research Center,
Peoria, Illinois under accession No. NRRL B-15784)
and mutants thereof encoding human serum albumin.

13. A microorganism transformed by a plasmid as
claimed in any one of claims 9 to 12.

14. A microorganism as claimed in claim 13 of the
genus Escherichia.

15. A microorganism as claimed in claim 14 of the
species coli.

16. A method of producing prepro-human serum albumin
which comprises cultivating on an aqueous nutrient
medium containing assimilable sources of carbon,
nitrogen and essential minerals and growth factors,
under prepro-human serum albumin-producing conditions,
a prokaryotic organism as claimed in claim 13
transformed by a plasmid capable of replicating
in said organism and having a deoxyribonucleotide sequence
coding for prepro-human serum albumin, and
recovering the prepro-human serum albumin so produced.

17. A method as claimed in claim 16 wherein the
prokaryotic organism is E. coli.

18. A method as claimed in claim 17 wherein the
prokaryotic organism is transformed by a plasmid
substantially similar to plasmid pGX401 as claimed
in claim 11.

19. E. coli strain NRRL No. 15784 (pGX401) or a mutant
thereof containing a human prepro-human serum
albumin gene.



' 0206733 

CLAIMS FOR THE DESIGNATED STATE: AT 

1. A process for preparing a gene coding for human 
serum albumin (HSA) which comprises obtaining HSA
mRNA from HSA-producing cells, in vitro synthesis of 
complementary DNA (cDNA) using said mRNA as a template 
and conversion of said cDNA to the double-stranded form. 
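
The base-pairing relationship underlying these steps can be sketched in a few lines of Python. This is only an informational model, offered as an illustration: it represents strands as strings and says nothing about the enzymes or reaction conditions used. The function names and the twelve-nucleotide mRNA fragment (the first four prepro triplets written with U in place of T) are assumptions introduced for the example.

```python
# Pairing partner, as a DNA base, for each RNA or DNA base.
DNA_COMPLEMENT = {"A": "T", "U": "A", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    """Return the complementary DNA strand, written 5' to 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(strand.upper()))


def make_double_stranded_cdna(mrna):
    """Model first- and second-strand synthesis from an mRNA template."""
    first_strand = reverse_complement(mrna)           # copied from the mRNA
    second_strand = reverse_complement(first_strand)  # copied from the first strand
    return first_strand, second_strand


mrna_fragment = "AUGAAGUGGGUA"
first, second = make_double_stranded_cdna(mrna_fragment)
print(first)   # TACCCACTTCAT
print(second)  # ATGAAGTGGGTA
```

The second strand has the same sense as the mRNA with T in place of U, which is why the claimed sequences can be read directly as coding triplets.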

2. A process as claimed in claim 1 wherein said gene 
codes for prepro-human serum albumin. 



3. A process as claimed in claim 1 wherein said 
gene comprises the following deoxyribonucleotide 
sequence which corresponds to the indicated amino 
acid sequence: 



Asp 


Ala 


His 


Lys 


Ser 


Glu 


Val 


Ala 


GAY 


G C X 


CAY 


A A M 


Q R S 


GAM 


G T X 


G C X 


His 


Arg 


Phe 


Lys 


Asp 


Leu 


Gly 


Glu 


CAY 


L G N 


T T Y 


A A M 


GAY 


Y T Z 


G G X 


GAM 


Glu 


Asn 


Phe 


Lys 


Ala 


Leu 


Val 


Leu 


GAM 


A A Y 


T T Y 


A A M 


G C X 


Y T Z 


G T X 


Y T Z 


He 


Ala 


Phe 


Ala 


Gin 


Tyr 


Leu 


Gin 


A T H 


G C X 


T T Y 


G C X 


CAM 


T A Y 


y T Z 


CAM 


Gin 


Cys 


Pro 


Phe 


Glu 


Asp 


His 


Val 


CAM 


T G Y 


C C X 


T T Y 


GAM 


GAY 


CAY 


G T X 


Lys 


Leu 


Val 


Asn 


Glu 


Val 


Thr 


Glu 


A A M 


Y T Z 


G T X 


A A Y 


GAM 


G T X 


A C X 


GAM 



Phe Ala 

T T Y G C X 

Glu Ser 

G A M Q R S 

Ser Leu 

Q R S Y T Z 

Lys Leu 

A A M Y T Z 

Arg Glu 

L G N GAM 

Asp Cys 

GAY T G Y 



Lys Thr 

A A M A C X 

Ala Glu 

G C X GAM 

His Thr 

CAY A C X 

Cys Thr 

T G Y A C X 

Thr Tyr 

A C X T A Y 

Cys Ala 

T G Y G C X 



Cys Val 

T G Y G T X 

Asn Cys 

A A Y T G Y 

Leu Phe 

Y T Z T T Y 

Val Ala 

G T X G C X 

Gly Glu 

G G X GAM 

Lys Gin 

A A M CAM 



Ala Asp 

G C X GAY 

Asp Lys 

GAY A A M 

Gly Asp 

G G X GAY 



Thr Leu 

A C X Y T Z 

Met Ala 

A T G G C X 

Glu Pro 

GAM C C X 
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GIu 


Arg 


GAM 


L G N 


His 


Lys 


CAY 


A A M 


Pro 


Arg 


C C X 


L G N 


Asp 


vax 


GAY 


G T X 


Asp 


Asn 


GAY 


A A Y 


Iiys 


Tyr 


A A M 


T A Y 


Acq 




L G N 


CAY 




Leu 


GAM 


Y T Z 


Tvr 




T A Y 


A A M 


Cys 


Ala 


T G Y 


G C X 


Cys 


Leu 


T G Y 


Y T Z 


Ii€U 


Arg 


Y T Z 


L G n 


Co r* 


Ala 


Q R S 


G C X 


t\LcL 


C A 1* 


G C X 


Q R S 


Arg 


Ala 


L G N 


G C X 


Ala 


Arg 


G C X 


L G N 


Lys 


Ala 


A A M 


G C X 


Lys 


Phe 


A A M 


T T Y 


VaL 


His 


G T X 


CAY 



z 

Asn GIu 

A A Y GAM 

Asp Asp 

GAY GAY 

Leu Val 

Y T Z G T X 

Met Cys 

A T G T G Y 

GIu GIu 

GAM GAM 

Leu Tyr 

Y T Z T A Y 

Pro Tyr 

OCX T A Y 

Leu Phe 

Y T Z T T Y 

Ala Ala 

G C X G C X 

Gin Ala 

CAM G C X 

Phe Pro 

T T Y C C X 

Asp GIu 

GAY GAM 

Lys Gin 

A A M CAM 

Leu Gin 

Y T Z CAM 

Phe Lys 

T T Y A A M 

Leu Ser 

Y T Z Q R S 

Glu Phe 

GAM T T Y 

Val Thr 

G T X A C X 

Thr GLu 

A C X GAM 



Cys 


Phe 


T G Y 


T T Y 


Asn 


Pro 


A A Y 


C C X 


Arg 


ir ro 


L G N 


C C X 


Thr 


Ala 


A C X 


G C X 


Thr 


Phe 


A C X 


T T Y 


Glu 


lie 


GAM 


A T H 


irne 


Thr 


T T y 


A C X 


Jrne 


Axa 


T T y 


G C X 


flits 




T T Y 


A C X 


Asp 


Lys 


GAY 


A A M 


Lys 


Leu 


A A M 


Y T Z 


Gly 


Lys 


G G X 


A A M 


Arg 


Leu 


L G N 


Y T Z 


Lys 


Phe 


A A M 


T T Y 


Ala 


Trp 


G C X 


T G G 


Gin 


Arg 


CAM 


L G N 


Ala 


Glu 


G C X 


GAM 


Asp 


Leu 


GAY 


Y T Z 


Cys 


Cys 


T G Y 


T G Y 



T A ■« 

Leu 


Gin 


Y T Z 


CAM 


Asn 


Leu 


A A Y 


y T 2 


Gl n 

VJX Ul 


VaX 


GAM 


G T X 


Pne 


His 


T T Y 


CAY 


lieu 


Lys 


y T z 


A A M 


Ala 


Arg 


G C X 


L G N 


Ala 


Pro 


G C X 


C C X 


Lys 


Arg 


A A M 


L G N 


(jIU 


Cys 


GAM 


T G Y 


Aia 


Ala 


G C X 


G C X 


Asp 


Glu 


GAY 


GAM 


Ala 


Ser 


G C X 


Q R S 


Lys 


Cys 


A A M 


T G Y 


Gly 


Glu 


G G X 


GAM 


Ala 


Val 


G C X 


G T X 


Phe 


Pro 


T T Y 


c C X 


Val 


Ser 


G T X 


Q R S 


Thr 


Lys 


A C X 


A A M 


His 


Gly 


CAY 


G G X 
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Asp 


Leu 


u A X 


X T 2 


Arg 


Ala 


Xj b N 


G C X 


Cys 


Glu 


T G Y 


GAM 


Ser 


Lys 


Q R S 


A A M 


hys 


Pro 


A A M 


OCX 


Cys 


He 


T G Y 


A T a 


Glu 


Met 


GAM 


A T G 


Phe 


Ala 


T T Y 


G C X 


Lys 


Asp 


A A M 


GAY 


Glu 


Ala 


GAM 


G C X 


Met 


Phe 


A T G 


T T Y 


Arg 


His 


L G N 


CAY 


Leu 


Leu 


Y T 2 


Y T Z 


Tyr 


Glu 


T A Y 


GAM 


Cys 


Ala 


T G Y 


G C X 


Cys 


Tyr 


T G Y 


T A Y 


Phe 


Lys 


T T Y 


A A M 



Leu Glu 

Y T Z GAM 

Asp Leu 

GAY y T Z 

Asn Gin 

A A Y CAM 

lieu Lys 

Y T 2 A A M 

I-eu Phe 

Y T Z T T Y 

Ala Glu 

G C X GAM 

Pro Ala 

C C X G C X 

Val Asp 

G T X G AY 

Val Cys 

G T X T G Y 

Lys Asp 

A A M GAY 

Phe Tyr 

T T Y T A Y 

Pro Asp 

C C X GAY 

Leu Arg 

Y T Z L G N 

Thr Thr 

A C X A C X 

Ala Ala 

G C X G C X 

Ala Lys 

G C X A A M 

Pro Pro 

C C X C C X 



Cys 
T G Y 


Ala 
G C X 


Ala 
G C X 


Lys 
A A M 


Asp 
GAY 


Ser 
Q R S 


Glu 
GAM 


Cys 
T G Y 


Glu 
GAM 


Lys 
A A M 


Val 
G T X 


Glu 
GAM 


Asp 
GAY 


Phe 
T T Y 


Phe 
T T Y 


Val 
G T X 


Lys 
A A M 


Asn 
A A Y 


Val 
G T X 


Phe 
T T Y 


Glu 
GAM 


Tyr 
T A Y 


Tyr 
T A Y 


Ser 
Q R S 


Leu 
Y T Z 


Ala 
G C X 


Leu 
Y T Z 


Glu 
GAM 


Asp 
GAY 


Pro 
C C X 


Val 
G T X 


Phe 
T T Y 


Val 
G T X 


Glu 
GAM 



Asp 


Asp 


GAY 


GAY 


Tyr 


lie 


T A y 


A T H 


He 


Ser 


A T H 


Q R S 


Cys 


Glu 


T G Y 


GAM 


Ser 


His 


Q R S 


CAY 


Asn 


Asp 


A A y 


GAY 


Pro 


Ser 


C C X 


Q R S 


Glu 


Ser 


GAM 


Q R S 


Tyr 


Ala 


T A Y 


G C X 


Leu 


Gly 


Y T Z 


G G X 


Ala 


Arg 


G C X 


L G N 


Val 


Val 


G T X 


G T X 


Lys 


Thr 


A A M 


A C X 


Lys 


Cys 


A A M 


T G Y 


His 


Glu 


CAY 


GAM 


Asp 


Glu 


GAY 


GAM 


Glu 


Pro 


GAM 


C C X 
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Gin Asn Phe He* 

CAM AAY TTY ATH 

Glu Leu Phe Glu 

GAM YTZ TTY GAM 

Tyr Lys Phe Gin 

TAY AAH TTY CAM 

Val Arg Tyr Thr 

GTX LGN TAY ACX 



Gin 


Leu 


Ser 


Thr 


CAM 


Y T Z 


Q R S 


ACX 


Glu 


Val 


Ser 


Arg 


GAM 


GTX 


Q R S 


L G H 


Val 


Gly 


Ser 


Lys 


GTX 


G G X 


Q R S 


A A M 


Pro 


Glu 


Ala 


Lys 


C C X 


GAM 


G C X 


A A M 


Ala 


Glu 


Asp 


Tyr 


G C X 


GAM 


GAY 


TAY 


Leu 


Asn 


Gin 


Leu 


Y T Z 


AAY 


CAM 


Y T Z 


Glu 


Lys 


Thr 


Pro 


GAM 


A A M 


ACX 


C C X 


Val 


Thr 


Lys 


Cys 


GTX 


ACX 


A A M 


T G Y 


Leu 


Val 


Asn 


Arg 


Y T Z 


GTX 


AAY 


LGN 


Ser 


Ala 


Leu 


Glu 


Q R S 


G C X 


Y T Z 


GAM 



Tyr Val Pro Lys 
TAY GTX CCX AAM 



Glu Thr Phe Thr 

GAM ACX TTY ACX 

He Cys Thr Leu 

ATH TGY ACX YTZ 



Lys Gin Asn Cys 

AAM CAM AAY TGY 

Gin Leu Gly Glu 

CAM YTZ GGX GAM 

Asn Ala Leu Phe 

AAY GCX YTZ TTY 

Lys Lys Val Pro 

AAM AAM GTX CCX 

Pro Thr Leu Val 

CCX ACX YTZ GTX 

Asn Leu Gly Lys 

AAY YTZ GGX AAM 

Cys Cys Lys His 

TGY TGY AAM CAY 

Arg Met Pro Cys 

LGN ATG CCX TGY 

Leu Ser Val Val 

YTZ QRS GTX GTX 

Cys Val Leu Bis 

TGY GTX YTZ CAY 

Val Ser Asp Arg 

GTX QRS GAY LGN 

Cys Thr Glu Ser 

TGY ACX GAM QRS 

Arg Pro Gly Phe 

LGN CCX GGX TTY 

Val Asp Glu Thr 

GTX GAY GAM ACX 

Glu Phe Asn Ala 

GAM TTY AAY GCX 

Phe Bis Ala Asp 

TTY CAY GCX GAY 



Ser Glu Lys Glu 
QRS GAM AAM GAM 
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Arg 
L G N 


Gin 
CAM 


lie 

A T H 


Lys 
A A M 


Lys 
A A M 


Glu 
GAM 


Thr 
A C X 


Ala 
G C X 


Leu 
Y T Z 


Val 
G T X 


Glu 
GAM 


Leu 
Y T Z 


Val 
G T X 


Lvs 
A A M 


His 
CAY 


■ Uj 9 

A A M 


Pro 
C C X 


Lys 
A A M 


Ala 
G C X 


Thr 
A C X 


Lys 
A A M 


Glu 
GAM 


Glu 
GAM 


Y T Z 


Lys 
A A N 


Ala 
G C X 


Val 
G T X 


Met 
A T G 


ASD 

GAY 


Asp 
GAY 


T T Y 


Axa 
G C X 


Ala 
G C X 


Phe 
T T y 


val 
G T X 


Glu 
GAM 


Lvs 
A A M 


Cvs 
T G Y 


T G Y 


Lys 
A A M 


G C X 


ASp 

GAY 


Asp 
GAY 


Lys 
A A M 


Glu 
GAM 


Thr 
A C X 


Cys 
TOY 


Phe 
T T Y 


Ala 
G C X 


Glu 
GAM 


Glu 
GAM 


Gly 
G G X 


Lys 
A A M 


Lys 
A A M 


Leu 
Y T Z 


Val 
G T X 


Ala 
G C X 


Ala 
G C X 


Ser 
Q R S 


Glu 
GAM 


Ala 
G C X 


Val 
G T X 


Leu 
Y T Z 


Gly 
G G X 



Leu 

Y T Z T A A 

wherein the 5' to 3' strand, beginning with the amino
terminus and the amino acids for which each triplet codes 
are shown, and wherein the abbreviations have the 
following standard meanings: 

A is deoxyadenyl

T is thymidyl

G is deoxyguanyl

C is deoxycytosyl

X is A, T, C or G

Y is T or C

When Y is C, Z is A, T, C or G
When Y is T, Z is A or G
H is A, T or C
Q is T or A

When Q is T, R is C and S is A, T, C or G
When Q is A, R is G and S is T or C
M is A or G
L is A or C

When L is A, N is A or G

When L is C, N is A, T, C or G

GLY is glycine

ALA is alanine

VAL is valine

LEU is leucine

ILE is isoleucine

SER is serine

THR is threonine

PHE is phenylalanine

TYR is tyrosine

TRP is tryptophan

CYS is cysteine

MET is methionine

ASP is aspartic acid

GLU is glutamic acid

LYS is lysine

ARG is arginine

HIS is histidine

PRO is proline 

GLN is glutamine 

ASN is asparagine 
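
The degenerate-triplet notation defined above can be expanded mechanically into the concrete codons each triplet stands for. The Python sketch below is one possible reading of those definitions, given for illustration only; the names `SIMPLE`, `DEPENDENT` and `expand_triplet` are hypothetical, and the rule "L is A or C" is taken as the intended reading of the list.

```python
from itertools import product

# Symbols whose meaning does not depend on any other position.
SIMPLE = {
    "A": "A", "T": "T", "G": "G", "C": "C",
    "X": "ATCG", "Y": "TC", "H": "ATC", "Q": "TA", "M": "AG", "L": "AC",
}

# Symbols whose meaning depends on another symbol of the same triplet:
# dependent symbol -> (controlling symbol, {controlling base: allowed bases}).
DEPENDENT = {
    "Z": ("Y", {"C": "ATCG", "T": "AG"}),
    "R": ("Q", {"T": "C", "A": "G"}),
    "S": ("Q", {"T": "ATCG", "A": "TC"}),
    "N": ("L", {"A": "AG", "C": "ATCG"}),
}


def expand_triplet(triplet):
    """Return the sorted concrete codons represented by a degenerate triplet."""
    symbols = "".join(triplet.split()).upper()
    if len(symbols) != 3:
        raise ValueError("expected exactly three symbols")
    simple_positions = [i for i, s in enumerate(symbols) if s in SIMPLE]
    codons = set()
    # Fix the independent positions first, then resolve the dependent ones.
    for choice in product(*(SIMPLE[symbols[i]] for i in simple_positions)):
        by_position = dict(zip(simple_positions, choice))
        by_symbol = {symbols[i]: base for i, base in by_position.items()}
        options = []
        for i, s in enumerate(symbols):
            if i in by_position:
                options.append(by_position[i])
            else:
                controller, allowed = DEPENDENT[s]
                options.append(allowed[by_symbol[controller]])
        codons.update("".join(c) for c in product(*options))
    return sorted(codons)


for t in ("Y T Z", "Q R S", "L G N", "GAM"):
    print(t, expand_triplet(t))
```

Under this reading, Y T Z expands to the six leucine codons, Q R S to the six serine codons, L G N to the six arginine codons and GAM to the two glutamic acid codons. A symbol outside the defined set raises a `KeyError` rather than being guessed at.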

4. A process as claimed in claim 2 wherein said
gene comprises the following deoxyribonucleotide 
sequence : 

Met Lys Trp Val Thr Phe 
ATG AAM TGG GTX ACX TTY 



lie 

A T a 

Ser 
Q R S 



Ser 
Q R S 

Ser 
Q R S 



Leu 
Y T 2 

Ala 
G C X 



Leu 
Y T Z 

Tyr 
T A Y 



Phe 
TTY 

Ser 
Q R S 



Leu 
Y T Z 

Arg 
L G N 



Phe 
TTY 

Gly 
G G X 



Val 
GTX 



Phe Arg Arg Asp Ala 
TTY LGN LGN GAY GCX 



His Lys 
CAY AAM 



Ser Glu 

Q R S GAM 

Asp Leu 

GAY Y T Z 

Ala Leu 

G C X Y T Z 

Gin Tyr 

CAM T A Y 

Glu Asp 

GAM GAY 

Glu Val 

GAM G T X 

Cys Val 

T G Y G T X 

Asn Cys. 

A A Y T G Y 

Leu Phe 

Y T Z T T Y 

Val Ala 

G T X G C X 

Gly Glu 

G G X GAM 

Lys Gin 

A A M CAM 

Cys Phe 

T G Y T T Y 

Asn Pro 

A A Y C C X 

Arg Pro 

L G N C C X 

Thr Ala 

A C X G C X 

Thr Phe 

A C X T T Y 
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Val 


Ala 


G T X 


G C X 


Gly 


Glu 


n G X 

VJ U A 


GAM 
vj n n 


Val 


Leu 


o rn v 


X 1 A 


Leu 


Gin 


1. X it 


r* A M 
u A n 


His 


Val 


o a V 


m V 
b X A 


Thr 


Glu 


A k« A 


2L M 

Vs A n 


Ala 


Asp 


V 

VJ C A 


u A X 


Asp 


Lys 


a V 

\j A X 


& A M 
Ann 


Gly 


Asp 


V 

l3 A 


u A X 


Thr 


Leu 


A C A 


V 7 
X 1 A 


Met 


Ala 


A T G 


o <^ v 
G C A 


Glu 


Pro 


GAM 


C C X 


Leu 


Gin 


Y T Z 


CAM 


Asn 


Leu 


A A X 


X X 6 


Glu 


Val 


GAM 


G T X 


Phe 


His 


T T Y 


CAY 


Leu 


Lys 


Y T Z 


A A M 



His Arg 

CAY L G N 

Glu Asn 

GAM A A Y 

He Ala 

A T H G C X 

Gin Cys 

CAM T G Y 

Lys Leu 

A A M Y T Z 

Phe Ala 

T T Y G C X 

Glu Ser 

GAM Q R S 

Ser Leu 

Q R S Y T Z 

Lys X<eu 

A A M Y T Z 

Arg Glu 

L G N GAM 



Asp Cys 

GAY T G Y 

Glu Arg 

GAM L G N 

His Lys 

CAY A A M 

Pro Arg 

C C X L G N 

ASD Val 

GAY G T X 

Asp Asn 

GAY A A Y 

Lys Tyr 

A A M T A Y 
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Phe 


Lys 


T T Y 


A A M 


Phe 


Lys 


T T Y 


A A M 
r\ t\ 


Phe 


Ala 


m m Y 


G C X 


Pro 


Phe 


r» V 
u c A 


T T V 
XIX 


Val 


Asn 


G T X 


A A X 


Lys 


Thr 


A A n 


A <^ Y 
A A 


Ala 


Glu 


G C X 


GAM 


His 


Thr 


CAY 


A C X 


Cys 


Thr 


T G V 

X U X 


A r Y 

n v« A 


Thr 


Tyr 


A C X 

n Va> A 


T A Y 
X n X 


Cys 


Ala 


m \/ 

T Ij X 


v 

Vj U A 


Asn 


Glu 


AAV 
A A X 


^ A k4 

vj A n 


Asp 


Asp 


GAY 


GAY 


Leu 


Val 


V rp 7 
X X « 


f2 T Y 
Vs X A 


Met 


Cys 


A T G 


T G Y 


Glu 


Glu 


GAM 


GAM 


Leu 


Tyr 


Y T Z 


T A Y 
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Giu 
GAM 


He 
A T H 


Ala 
G C X 


Arg 
L G N 


Phe 
T T Y 


Thr 
A C X 


Ala 
G C X 


Pro 
C C X 


Phe 
T T Y 


Ala 
G C X 


Lys 
A A M 


Arg 
L G N 


Phe 
T T Y 


Thr 
A C X 


Giu 
GAM 


Cys 
T G Y 


Asp 
GAY 


Lys 
A A M 


Ala 
G C X 


Ala 
G C X 


Lys 
A A M 


Leu 
y T Z 


Asp 
GAY 


Giu 
GAM 


GLy 
G G X 


Lys 
A A M 


Ala 
G C X 


Ser 
Q R S 


Arg 
I. G N 


Leu 
Y T Z 


Lys 
A A M 


Cys 
T G Y 


Lys 
A A M 


Phe 
T T Y 


Gly 
G G X 


Giu 
GAM 


Ala 
G C X 


Trp 
T G G 


Ala 
G C X 


Val 
G T X 


Gin 
CAM 


Arg 
L G N 


Phe 
T T Y 


Pro 
C C X 


Ala 
G C X 


Giu 
GAM 


Val 
G T X 


Ser 
Q R S 


Asp 
GAY 


Leu 
Y T Z 


Thr 
A C X 


Lys 
A A M 


Cys 
T G Y 


Cys 
T G Y 


Bis 
CAY 


Gly 
G G X 


Cys 
T G Y 


Ala 
G C X 


Asp 
GAY 


Asp 
GAY 


Ala 

G C X 


Lys 

A. A M 


Tyr 
T A Y 


He 
A T H 


Asp 
GAY 


Ser 
Q R S 


He 
A T H 


Ser 
Q R S 



Arg His Pro Tyr 

LGN CAY CCX TAY 

Giu Leu Leu Phe 

GAM YTZ YTZ TTY 

Tyr Lys Ala Ala 

TAY AAM GCX GCX 

Cys Ala Gin Ala 

TGY GCX CAM GCX 

Cys Leu Phe Pro 

TGY YTZ TTY CCX 

Leu Arg Asp Giu 

YTZ LGN GAY GAM 

Ser Ala Lys Gin 

QRS GCX AAM CAM 

Ala Ser Leu Gin 

GCX QRS YTZ CAM 

Arg Ala Phe Lys 

LGN GCX TTY AAM 

Ala Arg Leu Ser 

GCX LGN YTZ QRS 

Lys Ala Giu Phe 



AAM 


GCX 


GAM 


TTY 


Lys 


Phe 


Val 


Thr 


AAM 


TTY 


G T X 


A C X 


Val 


His 


Thr 


Giu 


G T X 


CAY 


A C X 


GAM 


Asp 


Leu 


Leu 


Giu 


GAY 


YTZ 


YTZ 


GAM 


Arg 


Ala 


Asp 


Leu 


LGN 


GCX 


GAY 


YTZ 


Cys 


Giu 


Asn 


Gin 


TGY 


GAM 


A A Y 


CAM 


Ser 


Lys 


Leu 


Lys 


QRS 


AAM 


YTZ 


AAM 



Glu 
GAM 


Cys 
T G Y 


Glu 
GAM 


Lys 
A A M 


Val 
G T X 


Glu 
GAM 


Asp 
GAY 


Phe 
T T Y 


Phe 
T T Y 


Val 
G T X 


Lys 
A A M 


Asn 
A A Y 


Val 
G T X 


Phe 
T T Y 


Glu 
GAM 


Tyr 
T A Y 


Tyr 
T A Y 


Ser 
Q R S 


Leu 
Y T Z 


Ala 
G C X 


Leu 
Y T Z 


Glu 
GAM 


Asp 
GAY 


Pro 
C C X 


Val 
G T X 


Phe 
T T Y 


Val 
G T X 


Glu 
GAM 


Lys 
A A M 


Gin 
CAM 


Gin 
CAM 


Leu 
Y T Z 


Asn 
A A Y 


Ala 
G C X 



Cys 
T G Y 


Glu 
GAM 


Ser 
Q R S 


His 
CAY 


Asn 
A A Y 


Asp 
GAY 


Pro 
C C X 


Ser 
Q R S 


Glu 
GAM 


Ser 
Q R S 


Tyr 
T A Y 


Ala 
G C X 


Leu 
Y T Z 


Gly 
G G X 


Ala 
G C X 


Arg 
L G N 


Val 
G T X 


Val 
G T X 


Lys 
A A M 


Thr 
A C X 


Lys 
A A M 


Cys 
T G Y 


His 
CAY 


Glu 
GAM 


Asp 
GAY 


Glu 
GAM 


Glu 
GAM 


Pro 
C C X 


Asn 
A A Y 


Cys 
T G Y 


Gly 
G G X 


Glu 
GAM 


Leu 
Y T Z 


Phe 
T T Y 



Lys 
A A M 


Pro 
C C X 


Cys 
T G Y 


He 
A T H 


Glu 
GAM 


Met 
A T G 


Phe 
T T Y 


Ala 
G C X 


Lys 
A A M 


Asp 
GAY 


Glu 
GAM 


Ala 
G C X 


Met 
A T G 


Phe 
T T Y 


Arg 
L G N 


His 
CAY 


Leu 
Y T Z 


Leu 
Y T Z 


Tyr 
T A Y 


Glu 
GAM 


Cys 
T G Y 


Ala 
G C X 


Cys 
T G Y 


Tyr 
T A Y 


Phe 
T T Y 


Lys 
A A M 


Gin 
CAM 


Asn 
A A Y 


Glu 
GAM 


Leu 
Y T Z 


Tyr 
T A Y 


Lys 
A A M 


Val 
G T X 


Arg 
L G N 
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Leu 
Y T Z 


Phe 

T T Y 


Ala 
G C X 


Glu 
GAM 


Pro 
C C X 


Ala 
G C X 


Val 
G T X 


Asp 
GAY 


Val 
G T X 


Cys 
T G Y 


Lys 
A A M 


Asp 
GAY 


Phe 
T T Y 


Tyr 
T A Y 


Pro 
C C X 


Asp 
GAY 


Leu 
Y T Z 


Arg 
L G N 


Thr 
A C X 


Thr 
A C X 


Ala 
G C X 


Ala 
G C X 


Ala 
G C X 


Lys 
A A M 


Pro 
C C X 


Pro 
C C X 


Phe 
T T Y 


He 
A T H 


Phe 
T T Y 


Glu 
GAM 


Phe 
T T Y 


Gin 
CAM 


Tyr 

T A y 


Thr 
A C X


Lys Lys Val Pro Gln Leu Ser Thr
AAM AAM GTX CCX CAM YTZ QRS ACX

Pro Thr Leu Val Glu Val Ser Arg
CCX ACX YTZ GTX GAM GTX QRS LGN

Asn Leu Gly Lys Val Gly Ser Lys
AAY YTZ GGX AAM GTX GGX QRS AAM

Cys Cys Lys His Pro Glu Ala Lys
TGY TGY AAM CAY CCX GAM GCX AAM

Arg Met Pro Cys Ala Glu Asp Tyr
LGN ATG CCX TGY GCX GAM GAY TAY

Leu Ser Val Val Leu Asn Gln Leu
YTZ QRS GTX GTX YTZ AAY CAM YTZ

Cys Val Leu His Glu Lys Thr Pro
TGY GTX YTZ CAY GAM AAM ACX CCX

Val Ser Asp Arg Val Thr Lys Cys
GTX QRS GAY LGN GTX ACX AAM TGY

Cys Thr Glu Ser Leu Val Asn Arg
TGY ACX GAM QRS YTZ GTX AAY LGN

Arg Pro Gly Phe Ser Ala Leu Glu
LGN CCX GGX TTY QRS GCX YTZ GAM

Val Asp Glu Thr Tyr Val Pro Lys
GTX GAY GAM ACX TAY GTX CCX AAM

Glu Phe Asn Ala Glu Thr Phe Thr
GAM TTY AAY GCX GAM ACX TTY ACX

Phe His Ala Asp Ile Cys Thr Leu
TTY CAY GCX GAY ATH TGY ACX YTZ

Ser Glu Lys Glu Arg Gln Ile Lys
QRS GAM AAM GAM LGN CAM ATH AAM

Lys Glu Thr Ala Leu Val Glu Leu
AAM GAM ACX GCX YTZ GTX GAM YTZ

Val Lys His Lys Pro Lys Ala Thr
GTX AAM CAY AAM CCX AAM GCX ACX

Lys Glu Glu Leu Lys Ala Val Met
AAM GAM GAM YTZ AAM GCX GTX ATG

Asp Asp Phe Ala Ala Phe Val Glu
GAY GAY TTY GCX GCX TTY GTX GAM

Lys Cys Cys Lys Ala Asp Asp Lys
AAM TGY TGY AAM GCX GAY GAY AAM

Glu Thr Cys Phe Ala Glu Glu Gly
GAM ACX TGY TTY GCX GAM GAM GGX

Lys Lys Leu Val Ala Ala Ser Glu
AAM AAM YTZ GTX GCX GCX QRS GAM

Ala Val Leu Gly Leu
GCX GTX YTZ GGX YTZ TAA

wherein the 5' to 3' strand, beginning with the amino
terminus, and the amino acids for which each triplet
codes are shown, and wherein the abbreviations are
defined as in claim 3.

5. A process as claimed in claim 3 wherein said 
gene comprises the following deoxyribonucleotide 
sequence: 

GAT GCA CAC AAG AGT GAG GTT GCT CAT CGG TTT AAA GAT TTG 
GGA GAA GAA AAT TTC AAA GCC TTG GTG TTG ATT GCC TTT GCT 
CAG TAT CTT CAG CAG TGT CCA TTT GAA GAT CAT GTA AAA TTA 
GTG AAT GAA GTA ACT GAA TTT GCA AAA ACA TGT GTT GCT GAT
GAG TCA GCT GAA AAT TGT GAC AAA TCA CTT CAT ACC CTT TTT 
GGA GAC AAA TTA TGC ACA GTT GCA ACT CTT CGT GAA ACC TAT 
GGT GAA ATG GCT GAC TGC TGT GCA AAA CAA GAA CCT GAG AGA 
AAT GAA TGC TTC TTG CAA CAC AAA GAT GAC AAC CCA AAC CTC 
CCC CGA TTG GTG AGA CCA GAG GTT GAT GTG ATG TGC ACT GCT 
TTT CAT GAC AAT GAA GAG ACA TTT TTG AAA AAA TAC TTA TAT 
GAA ATT GCC AGA AGA CAT CCT TAC TTT TAT GCC CCG GAA CTC 
CTT TTC TTT GCT AAA AGG TAT AAA GCT GCT TTT ACA GAA TGT 
TGC CAA GCT GCT GAT AAA GCT GCC TGC CTG TTG CCA AAG CTC 
GAT GAA CTT CGG GAT GAA GGG AAG GCT TCG TCT GCC AAA CAG 
AGA CTC AAG TGT GCC AGT CTC CAA AAA TTT GGA GAA AGA GCT 
TTC AAA GCA TGG GCG GTG GCT CGC CTG AGC CAG AGA TTT CCC 
AAA GCT GAG TTT GCA GAA GTT TCC AAG TTA GTG ACA GAT CTT 
ACC AAA GTC CAC ACG GAA TGC TGC CAT GGA GAT CTG CTT GAA 
TGT GCT GAT GAC AGG GCG GAC CTT GCC AAG TAT ATC TGT GAA 
AAT CAA GAT TCG ATC TCC AGT AAA CTG AAG GAA TGC TGT GAA
AAA CCT CTG TTG GAA AAA TCC CAC TGC ATT GCC GAA GTG GAA
AAT GAT GAG ATG CCT GCT GAC TTG CCT TCA TTA GCT GCT GAT
TTT GTT GAA AGT AAG GAT GTT TGC AAA AAC TAT GCT GAG GCA
AAG GAT GTC TTC CTG GGC ATG TTT TTG TAT GAA TAT GCA AGA
AGG CAT CCT GAT TAC TCT GTC GTG CTG CTG CTG AGA CTT GCC
AAG ACA TAT GAA ACC ACT CTA GAG AAG TGC TGT GCC GCT GCA
GAT CCT CAT GAA TGC TAT GCC AAA GTG TTC GAT GAA TTT AAA
CCT CCT GTG GAA GAG CCT CAG AAT TTA ATC AAA CAA AAT TGT
GAG CTT TTT GAG CAG CTT GGA GAG TAC AAA TTC CAG AAT GCG
CTA TTA GTT CGT TAC ACC AAG AAA GTA CCC CAA GTG TCA ACT
CCA ACT CTT GTA GAG GTC TCA AGA AAC CTA GGA AAA GTG GGC
AGC AAA TGT TGT AAA CAT CCT GAA GCA AAA AGA ATG CCC TGT
GCA GAA GAC TAT CTA TCC GTG GTC CTG AAC CAG TTA TGT GTG
TTG CAT GAG AAA ACG CCA GTA AGT GAC AGA GTC ACC AAA TGC
TGC ACA GAA TCC TTG GTG AAC AGG CGA CCA TGC TTT TCA GCT
CTG GAA GTC GAT GAA ACA TAC GTT CCC AAA GAG TTT AAT GCT
GAA ACA TTC ACC TTC CAT GCA GAT ATA TGC ACA CTT TCT GAG
AAG GAG AGA CAA ATC AAG AAA CAA ACT GCA CTT GTT GAG CTC
GTG AAA CAC AAG CCC AAG GCA ACA AAA GAG CAA CTG AAA GCT
GTT ATG GAT GAT TTC GCA GCT TTT GTA GAG AAG TGC TGC AAG
GCT GAC GAT AAG GAG ACC TGC TTT GCC GAG GAG GGT AAA AAA
CTT GTT GCT GCA AGT CAA GCT GCC TTA GGC TTA TAA

wherein the 5' to 3' strand, beginning with the amino
terminus, is shown, and wherein the abbreviations are
defined as in claim 3.



6. A process as claimed in claim 4 wherein said
gene comprises the following deoxyribonucleotide
sequence:

ATG AAG TGG GTA ACC TTT ATT TCC CTT CTT TTT CTC TTT AGC 
TCG GCT TAT TCC AGG GGT GTG TTT CGT CGA GAT GCA CAC AAG 
AGT GAG GTT GCT CAT CGG TTT AAA GAT TTG GGA GAA GAA AAT 
TTC AAA GCC TTG GTG TTG ATT GCC TTT GCT CAG TAT CTT CAG 
CAG TGT CCA TTT GAA GAT CAT GTA AAA TTA GTG AAT GAA GTA 
ACT GAA TTT GCA AAA ACA TGT GTT GCT GAT GAG TCA GCT GAA
AAT TGT GAC AAA TCA CTT CAT ACC CTT TTT GGA GAC AAA TTA
TGC ACA GTT GCA ACT CTT CGT GAA ACC TAT GGT GAA ATG GCT
GAC TGC TGT GCA AAA CAA GAA CCT GAG AGA AAT GAA TGC TTC
TTG CAA CAC AAA GAT GAC AAC CCA AAC CTC CCC CGA TTG GTG
AGA CCA GAG GTT GAT GTG ATG TGC ACT GCT TTT CAT GAC AAT 
GAA GAG ACA TTT TTG AAA AAA TAC TTA TAT GAA ATT GCC AGA 
AGA CAT CCT TAC TTT TAT GCC CCG GAA CTC CTT TTC TTT GCT 
AAA AGG TAT AAA GCT GCT TTT ACA GAA TGT TGC CAA GCT GCT 
GAT AAA GCT GCC TGC CTG TTG CCA AAG CTC GAT GAA CTT CGG 
GAT GAA GGG AAG GCT TCG TCT GCC AAA CAG AGA CTC AAG TGT 
GCC AGT CTC CAA AAA TTT GGA GAA AGA GCT TTC AAA GCA TGG 
GCG GTG GCT CGC CTG AGC CAG AGA TTT CCC AAA GCT GAG TTT
GCA GAA GTT TCC AAG TTA GTG ACA GAT CTT ACC AAA GTC CAC 
ACG GAA TGC TGC CAT GGA GAT CTG CTT GAA TGT GCT GAT GAC 
AGG GCG GAC CTT GCC AAG TAT ATC TGT GAA AAT CAA GAT TCG 
ATC TCC AGT AAA CTG AAG GAA TGC TGT GAA AAA CCT CTG TTG 
GAA AAA TCC CAC TGC ATT GCC GAA GTG GAA AAT GAT GAG ATG 
CCT GCT GAC TTG CCT TCA TTA GCT GCT GAT TTT GTT GAA AGT 
AAG GAT GTT TGC AAA AAC TAT GCT GAG GCA AAG GAT GTC TTC 
CTG GGC ATG TTT TTG TAT GAA TAT GCA AGA AGG CAT CCT GAT 
TAC TCT GTC GTG CTG CTG CTG AGA CTT GCC AAG ACA TAT GAA 
ACC ACT CTA GAG AAG TGC TGT GCC GCT GCA GAT CCT CAT GAA 
TGC TAT GCC AAA GTG TTC GAT GAA TTT AAA CCT CCT GTG GAA 
GAG CCT CAG AAT TTA ATC AAA CAA AAT TGT GAG CTT TTT GAG 
CAG CTT GGA GAG TAC AAA TTC CAG AAT GCG CTA TTA GTT CGT
TAC ACC AAG AAA GTA CCC CAA GTG TCA ACT CCA ACT CTT GTA
GAG GTC TCA AGA AAC CTA GGA AAA GTG GGC AGC AAA TGT TGT
AAA CAT CCT GAA GCA AAA AGA ATG CCC TGT GCA GAA GAC TAT 
CTA TCC GTG GTC CTG AAC CAG TTA TGT GTG TTG CAT GAG AAA 
ACG CCA GTA AGT GAC AGA GTC ACC AAA TGC TGC ACA GAA TCC 
TTG GTG AAC AGG CGA CCA TGC TTT TCA GCT CTG GAA GTC GAT 
GAA ACA TAC GTT CCC AAA GAG TTT AAT GCT GAA ACA TTC ACC 
TTC CAT GCA GAT ATA TGC ACA CTT TCT GAG AAG GAG AGA CAA 
ATC AAG AAA CAA ACT GCA CTT GTT GAG CTC GTG AAA CAC AAG 
CCC AAG GCA ACA AAA GAG CAA CTG AAA GCT GTT ATG GAT GAT
TTC GCA GCT TTT GTA GAG AAG TGC TGC AAG GCT GAC GAT AAG
GAG ACC TGC TTT GCC GAG GAG GGT AAA AAA CTT GTT GCT GCA 
AGT CAA GCT GCC TTA GGC TTA TAA 

wherein the 5' to 3' strand, beginning with the amino
terminus, is shown, and wherein the abbreviations are
defined as in claim 3.

7. A process as claimed in claim 6 wherein said
gene is comprised in the following deoxyribonucleotide
sequence:
5'

TCTCTTCTGTCAACCCCACGCCTTTGGCACA ATG AAG TGG GTA 



ACC TTT ATT TCC CTT CTT TTT CTC TTT AGC TCG GCT TAT TCC
AGG GGT GTG TTT CGT CGA GAT GCA CAC AAG AGT GAG GTT GCT
CAT CGG TTT AAA GAT TTG GGA GAA GAA AAT TTC AAA GCC TTG
GTG TTG ATT GCC TTT GCT CAG TAT CTT CAG CAG TGT CCA TTT
GAA GAT CAT GTA AAA TTA GTC AAT GAA GTA ACT GAA TTT GCA
AAA ACA TGT GTT GCT GAT GAG TCA GCT GAA AAT TGT GAC AAA
TCA CTT CAT ACC CTT TTT GGA GAC AAA TTA TGC ACA GTT GCA
ACT CTT CGT GAA ACC TAT GGT GAA ATG GCT GAC TGC TGT GCA
AAA CAA GAA CCT GAG AGA AAT GAA TGC TTC TTG CAA CAC AAA
GAT GAC AAC CCA AAC CTC CCC CGA TTG GTG AGA CCA GAG GTT
GAT GTG ATG TGC ACT GCT TTT CAT GAC AAT GAA GAG ACA TTT
TTG AAA AAA TAC TTA TAT GAA ATT GCC AGA AGA CAT CCT TAC
TTT TAT GCC CCG GAA CTC CTT TTC TTT GCT AAA AGG TAT AAA
GCT GCT TTT ACA GAA TGT TGC CAA GCT GCT GAT AAA GCT GCC
TGC CTG TTG CCA AAG CTC GAT GAA CTT CGG GAT GAA GGG AAG
GCT TCG TCT GCC AAA CAG AGA CTC AAG TGT GCC AGT CTC CAA
AAA TTT GGA GAA AGA GCT TTC AAA GCA TGG GCG GTG GCT CGC
CTG AGC CAG AGA TTT CCC AAA GCT GAG TTT GCA GAA GTT TCC
AAG TTA GTG ACA GAT CTT ACC AAA GTC CAC ACG GAA TGC TGC
CAT GGA GAT CTG CTT GAA TGT GCT GAT GAC AGG GCG GAC CTT
GCC AAG TAT ATC TGT GAA AAT CAA GAT TCG ATC TCC AGT AAA
CTG AAG GAA TGC TGT GAA AAA CCT CTG TTG GAA AAA TCC CAC
TGC ATT GCC GAA GTG GAA AAT GAT GAG ATG CCT GCT GAC TTG
CCT TCA TTA GCT GCT GAT TTT GTT GAA AGT AAG GAT GTT TGC
AAA AAC TAT GCT GAG GCA AAG GAT GTC TTC CTG GGC ATG TTT
TTG TAT GAA TAT GCA AGA AGG CAT CCT GAT TAC TCT GTC GTG
CTG CTG CTG AGA CTT GCC AAG ACA TAT GAA ACC ACT CTA GAG
AAG TGC TGT GCC GCT GCA GAT CCT CAT GAA TGC TAT GCC AAA
GTG TTC GAT GAA TTT AAA CCT CCT GTG GAA GAG CCT CAG AAT
TTA ATC AAA CAA AAT TGT GAG CTT TTT GAG CAG CTT GGA GAG
TAC AAA TTC CAG AAT GCG CTA TTA GTT CGT TAC ACC AAG AAA
GTA CCC CAA GTG TCA ACT CCA ACT CTT GTA GAG GTC TCA AGA
AAC CTA GGA AAA GTG GGC AGC AAA TGT TGT AAA CAT CCT GAA
GCA AAA AGA ATG CCC TGT GCA GAA GAC TAT CTA TCC GTG GTC
CTG AAC CAG TTA TGT GTG TTG CAT GAG AAA ACG CCA GTA AGT
GAC AGA GTC ACC AAA TGC TGC ACA GAA TCC TTG GTG AAC AGG
CGA CCA TGC TTT TCA GCT CTG GAA GTC GAT GAA ACA TAC GTT
CCC AAA GAG TTT AAT GCT GAA ACA TTC ACC TTC CAT GCA GAT
ATA TGC ACA CTT TCT GAG AAG GAG AGA CAA ATC AAG AAA CAA
ACT GCA CTT GTT GAG CTC GTG AAA CAC AAG CCC AAG GCA ACA
AAA GAG CAA CTG AAA GCT GTT ATG GAT GAT TTC GCA GCT TTT
GTA GAG AAG TGC TGC AAG GCT GAC GAT AAG GAG ACC TGC TTC
GCC GAG GAG GGT AAA AAA CTT GTT GCT GCA AGT CAA GCT GCC



TTA GGC TTA TAA CATCTACATTTAAAAGCATCTCAGCCTACCATGAGAATA 
AGAGAAAGAAAATGAAGATCAAAAGCTTATTCATCTGTTTTCTTTTTCGTTGGTG 
TTTTAATCATTTTGCCTCTTTTCTCTGTGCTTCAATTAATAAAAAATGGAAAGAA 
TCTAA 

wherein the 5' to 3' strand, beginning with the amino
terminus, is shown, and wherein the abbreviations are
defined as in claim 3.

8. A process for preparing a plasmid encoding human
serum albumin which comprises inserting a
deoxyribonucleotide sequence coding for human serum albumin
into a plasmid having the capability of replication
in a prokaryotic or eukaryotic organism.

9. A process as claimed in claim 8 wherein the
deoxyribonucleotide sequence coding for human serum
albumin is prepared by a process as claimed in
any one of claims 1 to 7.

10. A process as claimed in claim 8 or claim 9
wherein the deoxyribonucleotide sequence coding for
human serum albumin is inserted into a plasmid
having the capability of replication in a prokaryotic
organism of the genus Escherichia.

11. The process of claim 10 wherein the
deoxyribonucleotide sequence of claim 8 is inserted at the
Pst I site of plasmid pBR322 so as to prepare plasmid
pGX401.

12. A process for preparing a microorganism containing
a gene coding for human serum albumin which comprises
transforming a microorganism with a plasmid capable
of replicating in said microorganism and including
said gene.

13. A process as claimed in claim 12 wherein said 
plasmid is prepared by a process as claimed in any 
one of claims 8 to 11. 

14. A microorganism transformed by a plasmid
containing a deoxyribonucleotide sequence as defined
in any one of claims 3 to 7.

15. A microorganism as claimed in claim 14 of
the genus Escherichia.

16. A method of producing prepro-human serum
albumin which comprises cultivating on an aqueous
nutrient medium containing assimilable sources of
carbon, nitrogen and essential minerals and growth
factors, under prepro-human serum albumin-producing
conditions, a prokaryotic organism transformed by
a plasmid capable of replicating in said organism
and having a deoxyribonucleotide sequence coding for
prepro-human serum albumin, and recovering the
prepro-human serum albumin so produced.

17. A method as claimed in claim 16 wherein the
prokaryotic organism is E. coli.

18. A method as claimed in claim 17 wherein the
prokaryotic organism is transformed by a plasmid
substantially similar to plasmid pGX401.

19. E. coli strain NRRL No. 15784 (pGX401), or a
mutant thereof containing a human prepro-human serum
albumin gene.
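
By way of illustration only, the deoxyribonucleotide sequences recited in claims 5 to 7 may be checked against the amino acids they specify, and searched for Pst I recognition sites of the kind referred to in claim 11, with a short script. The following is a minimal sketch in Python and forms no part of the claimed processes. It assumes the standard genetic code and the Pst I recognition sequence CTGCAG; the helper names clean, translate and find_sites are hypothetical and are introduced here only for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def clean(seq):
    """Drop whitespace between triplets and upper-case the sequence."""
    return "".join(seq.split()).upper()


def translate(seq):
    """Translate a 5'-to-3' coding strand into one-letter amino acid codes."""
    dna = clean(seq)
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def find_sites(seq, site="CTGCAG"):
    """Return 0-based start positions of a recognition site (default Pst I)."""
    dna = clean(seq)
    hits = []
    start = dna.find(site)
    while start != -1:
        hits.append(start)
        start = dna.find(site, start + 1)
    return hits


if __name__ == "__main__":
    # First row of the sequence recited in claim 6.
    first_row = "ATG AAG TGG GTA ACC TTT ATT TCC CTT CTT TTT CTC TTT AGC"
    print(translate(first_row))
    # A fragment of the same claim that happens to contain CTGCAG.
    print(find_sites("GCC GCT GCA GAT"))
```

Applied to a whole recited sequence, translate gives the amino acids specified by each triplet, which can be compared with the labels of Figure 2, and find_sites lists any internal CTGCAG positions. An unrecognised triplet, such as one left by a transcription error, raises a KeyError rather than being skipped.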
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Figure 2 

Complete Nucleotide Sequence of the 
HSA Insert In Clone pGX401 



                                |<
5'                              Met Lys Trp Val
TCTCTTCTGTCAACCCCACGCCTTTGGCACA ATG AAG TGG GTA

                      pre HSA                        >|
Thr Phe Ile Ser Leu Leu Phe Leu Phe Ser Ser Ala Tyr Ser
ACC TTT ATT TCC CTT CTT TTT CTC TTT AGC TCG GCT TAT TCC

|<      pro HSA      >|
Arg Gly Val Phe Arg Arg Asp Ala His Lys Ser Glu Val Ala
AGG GGT GTG TTT CGT CGA GAT GCA CAC AAG AGT GAG GTT GCT

His Arg Phe Lys Asp Leu Gly Glu Glu Asn Phe Lys Ala Leu
CAT CGG TTT AAA GAT TTG GGA GAA GAA AAT TTC AAA GCC TTG

Val Leu Ile Ala Phe Ala Gln Tyr Leu Gln Gln Cys Pro Phe
GTG TTG ATT GCC TTT GCT CAG TAT CTT CAG CAG TGT CCA TTT

Glu Asp His Val Lys Leu Val Asn Glu Val Thr Glu Phe Ala
GAA GAT CAT GTA AAA TTA GTC AAT GAA GTA ACT GAA TTT GCA

Lys Thr Cys Val Ala Asp Glu Ser Ala Glu Asn Cys Asp Lys
AAA ACA TGT GTT GCT GAT GAG TCA GCT GAA AAT TGT GAC AAA

Ser Leu His Thr Leu Phe Gly Asp Lys Leu Cys Thr Val Ala
TCA CTT CAT ACC CTT TTT GGA GAC AAA TTA TGC ACA GTT GCA

Thr Leu Arg Glu Thr Tyr Gly Glu Met Ala Asp Cys Cys Ala
ACT CTT CGT GAA ACC TAT GGT GAA ATG GCT GAC TGC TGT GCA

Lys Gln Glu Pro Glu Arg Asn Glu Cys Phe Leu Gln His Lys
AAA CAA GAA CCT GAG AGA AAT GAA TGC TTC TTG CAA CAC AAA

Asp Asp Asn Pro Asn Leu Pro Arg Leu Val Arg Pro Glu Val
GAT GAC AAC CCA AAC CTC CCC CGA TTG GTG AGA CCA GAG GTT

Asp Val Met Cys Thr Ala Phe His Asp Asn Glu Glu Thr Phe
GAT GTG ATG TGC ACT GCT TTT CAT GAC AAT GAA GAG ACA TTT

Leu Lys Lys Tyr Leu Tyr Glu Ile Ala Arg Arg His Pro Tyr
TTG AAA AAA TAC TTA TAT GAA ATT GCC AGA AGA CAT CCT TAC

Phe Thr Ala Pro Glu Leu Leu Phe Phe Ala Lys Arg Tyr Lys
TTT TAT GCC CCG GAA CTC CTT TTC TTT GCT AAA AGG TAT AAA

Figure 2 (continued)

Ala Ala Phe Thr Glu Cys Cys Ala Gln Ala Asp Lys Ala Ala
GCT GCT TTT ACA GAA TGT TGC CAA GCT GCT GAT AAA GCT GCC

Cys Leu Phe Pro Lys Leu Asp Glu Leu Arg Asp Glu Gly Lys
TGC CTG TTG CCA AAG CTC GAT GAA CTT CGG GAT GAA GGG AAG

Ala Ser Ser Ala Lys Gln Arg Leu Lys Cys Ala Ser Leu Gln
GCT TCG TCT GCC AAA CAG AGA CTC AAG TGT GCC AGT CTC CAA

Lys Phe Gly Glu Arg Ala Phe Lys Ala Trp Ala Val Ala Arg
AAA TTT GGA GAA AGA GCT TTC AAA GCA TGG GCG GTG GCT CGC

Leu Ser Gln Arg Phe Pro Lys Ala Glu Phe Ala Glu Val Ser
CTG AGC CAG AGA TTT CCC AAA GCT GAG TTT GCA GAA GTT TCC

Lys Phe Val Thr Asp Leu Thr Lys Val His Thr Glu Cys Cys
AAG TTA GTG ACA GAT CTT ACC AAA GTC CAC ACG GAA TGC TGC

His Gly Asp Leu Leu Glu Cys Ala Asp Asp Arg Ala Asp Leu
CAT GGA GAT CTG CTT GAA TGT GCT GAT GAC AGG GCG GAC CTT

Ala Lys Tyr Ile Cys Glu Asn Gln Asp Ser Ile Ser Ser Lys
GCC AAG TAT ATC TGT GAA AAT CAA GAT TCG ATC TCC AGT AAA

Leu Lys Glu Cys Cys Glu Lys Pro Leu Phe Glu Lys Ser His
CTG AAG GAA TGC TGT GAA AAA CCT CTG TTG GAA AAA TCC CAC

Cys Ile Ala Glu Val Glu Asn Asp Glu Met Pro Ala Asp Phe
TGC ATT GCC GAA GTG GAA AAT GAT GAG ATG CCT GCT GAC TTG

Pro Ser Phe Ala Val Asp Phe Val Glu Ser Lys Asp Val Cys
CCT TCA TTA GCT GCT GAT TTT GTT GAA AGT AAG GAT GTT TGC

Lys Asn Tyr Ala Glu Ala Lys Asp Val Phe Leu Gly Met Phe
AAA AAC TAT GCT GAG GCA AAG GAT GTC TTC CTG GGC ATG TTT

Phe Tyr Glu Tyr Ala Arg Arg His Pro Asp Tyr Ser Val Val
TTG TAT GAA TAT GCA AGA AGG CAT CCT GAT TAC TCT GTC GTG

Leu Leu Leu Arg Leu Ala Lys Thr Tyr Glu Thr Thr Leu Glu
CTG CTG CTG AGA CTT GCC AAG ACA TAT GAA ACC ACT CTA GAG

Lys Cys Cys Ala Ala Ala Asp Pro His Glu Cys Tyr Ala Lys
AAG TGC TGT GCC GCT GCA GAT CCT CAT GAA TGC TAT GCC AAA

Val Phe Asp Glu Phe Lys Pro Pro Val Glu Glu Pro Gln Asn
GTG TTC GAT GAA TTT AAA CCT CCT GTG GAA GAG CCT CAG AAT

Leu Ile Lys Gln Asn Cys Glu Leu Phe Glu Gln Leu Gly Glu
TTA ATC AAA CAA AAT TGT GAG CTT TTT GAG CAG CTT GGA GAG

Figure 2 (continued)


Tyr Lys Phe Gln Asn Ala Leu Phe Val Arg Tyr Thr Lys Lys
TAC AAA TTC CAG AAT GCG CTA TTA GTT CGT TAC ACC AAG AAA

Val Pro Gln Leu Ser Thr Pro Thr Leu Val Glu Val Ser Arg
GTA CCC CAA GTG TCA ACT CCA ACT CTT GTA GAG GTC TCA AGA

Asn Leu Gly Lys Val Gly Ser Lys Cys Cys Lys His Pro Glu
AAC CTA GGA AAA GTG GGC AGC AAA TGT TGT AAA CAT CCT GAA

Ala Lys Arg Met Pro Cys Ala Glu Asp Tyr Leu Ser Val Val
GCA AAA AGA ATG CCC TGT GCA GAA GAC TAT CTA TCC GTG GTC

Leu Asn Gln Leu Cys Val Leu His Glu Lys Thr Pro Val Ser
CTG AAC CAG TTA TGT GTG TTG CAT GAG AAA ACG CCA GTA AGT

Asp Arg Val Thr Lys Cys Cys Thr Glu Ser Leu Val Asn Arg
GAC AGA GTC ACC AAA TGC TGC ACA GAA TCC TTG GTG AAC AGG

Arg Pro Gly Phe Ser Ala Leu Glu Val Asp Glu Thr Tyr Val
CGA CCA TGC TTT TCA GCT CTG GAA GTC GAT GAA ACA TAC GTT

Pro Lys Glu Phe Asn Ala Glu Thr Phe Thr Phe His Ala Asp
CCC AAA GAG TTT AAT GCT GAA ACA TTC ACC TTC CAT GCA GAT

Ile Cys Thr Leu Ser Glu Lys Glu Arg Gln Ile Lys Lys Glu
ATA TGC ACA CTT TCT GAG AAG GAG AGA CAA ATC AAG AAA CAA

Thr Ala Leu Val Glu Leu Val Lys His Lys Pro Lys Ala Thr
ACT GCA CTT GTT GAG CTC GTG AAA CAC AAG CCC AAG GCA ACA

Lys Glu Glu Leu Lys Ala Val Met Asp Asp Phe Ala Ala Phe
AAA GAG CAA CTG AAA GCT GTT ATG GAT GAT TTC GCA GCT TTT

Val Glu Lys Cys Cys Lys Ala Asp Asp Lys Glu Thr Cys Phe
GTA GAG AAG TGC TGC AAG GCT GAC GAT AAG GAG ACC TGC TTC

Ala Glu Glu Gly Lys Lys Leu Val Ala Ala Ser Glu Ala Val
GCC GAG GAG GGT AAA AAA CTT GTT GCT GCA AGT CAA GCT GCC

Leu Gly Leu STOP
TTA GGC TTA TAA CATCTACATTTAAAAGCATCTCAGCCTACCATGAGAATA
AGAGAAAGAAAATGAAGATCAAAAGCTTATTCATCTGTTTTCTTTTTCGTTGGTG 
TTTTAATCATTTTGCCTCTTTTCTCTGTGCTTCAATTAATAAAAAATGGAAAGAA 
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